ABSTRACT

Data-centric dynamic systems are systems where both the process
controlling the dynamics and the manipulation of data are equally
central. Such systems have recently attracted increasing interest
from the scientific community, especially in their variant called
artifact-centric business processes. In this paper we study
verification of (first-order) µ-calculus variants over relational
data-centric dynamic systems, where data are represented by a full- 
fledged relational database, and the process is described in terms 
of atomic actions that evolve the database. The execution of such 
actions may involve calls to external services, providing fresh data 
inserted into the system. As a result such systems are typically 
infinite-state. We show that verification is undecidable in general, 
and we isolate notable cases in which decidability is achieved.
Specifically, we start by considering service calls that return values
deterministically (depending only on the passed parameters). We show
that, in a µ-calculus variant that preserves knowledge of objects that
appeared along a run, we get decidability under the assumption that the
fresh data introduced along a run are bounded, though they might 
not be bounded in the overall system. In fact we tie such a result to 
a notion related to weak acyclicity studied in data exchange. Then, 
we move to nondeterministic services, where the assumption of
data-bounded runs would result in a bound on the service calls that can
be invoked during the execution, and hence would be too restrictive. So
we investigate decidability under the assumption that knowledge of
objects is preserved only if they are continuously present. We show
that if infinitely many values occur in a run but do not accumulate
in the same state, then we again get decidability. We give syntactic
conditions to avoid this accumulation through the novel notion
of "generate-recall acyclicity", which takes into consideration that
every service call activation generates new values that cannot be 
accumulated indefinitely. 

1. INTRODUCTION 

Data-centric dynamic systems (DCDSs) are systems where both 
the process controlling the dynamics and the manipulated data are 
equally central. Such systems have recently attracted increasing
interest from the scientific community. In particular, the so-called
artifact-centric approach to modeling business processes has
emerged, with the fundamental characteristic of considering both 
data and processes as first-class citizens in service design and
analysis [32, 26, 18, 15, 36, 1]. This holistic view of data and processes
together promises to avoid the notorious discrepancy between data 
modeling and process modeling of more traditional approaches that 
consider these two aspects separately ||7ll6l. 

DCDSs are constituted by (i) a data layer, which is used to hold 
the relevant information to be manipulated by the system, and (ii)
a process layer formed by the invokable (atomic) actions and a 
process based on them. Such a process characterizes the dynamic 
behavior of the system. Executing an action has effects on the data 
manipulated by the system, on the process state, and on the infor- 
mation exchanged with the external world. 

DCDSs deeply challenge formal verification by requiring simul- 
taneous attention to both data and processes: indeed, on the one 
hand they deal with full-fledged processes and require analysis in 
terms of sophisticated temporal properties [17]; on the other hand, 
the presence of possibly unbounded data makes the usual analysis 
based on model checking of finite-state systems impossible in gen- 
eral, since, when data evolution is taken into account, the whole 
system becomes infinite-state. 

In this paper we study relational DCDSs, where data are repre- 
sented by a full-fledged relational database, and the process is de- 
scribed in terms of atomic actions that evolve the database. The ex- 
ecution of such actions may involve calls to external services, pro- 
viding fresh data inserted into the system. As a result such systems 
are infinite-state in general. In particular, actions are characterized 
using conditional effects. Effects are specified using first-order 
(FO) queries to extract from the current database the objects we 
want to persist in the next state, and using conjunctive queries on 
these objects to generate the facts that are true in the next state. In 
addition, to finalize the next state we call external services (func- 
tion calls) that provide new information and objects coming from 
the external world. 

On top of such a framework, we introduce powerful verification
logics, which are FO variants of µ-calculus [29, 33, 22, 13].
µ-calculus is well known to be more expressive than virtually all
temporal logics used in verification, including CTL, LTL, CTL*, PDL,
and many others. Our approach is remarkably robust: while it is common
to use simpler logics like CTL and LTL to obtain decidability of
verification, our decidability results hold for significantly more
expressive µ-calculus variants, and thus carry over to all these other
logics. Our variants of µ-calculus are based on first-order queries
over data in the states of the DCDS, and allow for first-order
quantification across states (within and across runs), though in a
controlled way. By contrast, no limitations whatsoever are put on the
fixpoint formulae, which are the key element of the µ-calculus.



In particular we consider two variants of µ-calculus. The first
variant is called $\mu\mathcal{L}_A$, and requires that first-order
quantification across states always be bounded to the active domain of
the state where the quantification is evaluated. This quantification
mechanism indirectly preserves, at any point, knowledge of objects that
appeared in the history so far, even if they disappeared in the
meantime. The second variant, called $\mu\mathcal{L}_P$, restricts the
first-order quantification in $\mu\mathcal{L}_A$ by requiring that only
quantified objects that are still present in the current domain are of
interest as we move from one state to the next. That is, knowledge of
objects is preserved only if they are continuously present. For these
two logics we define novel notions of bisimulation, which we exploit to
prove our results.

We show that verification of both $\mu\mathcal{L}_A$ and
$\mu\mathcal{L}_P$ is undecidable in general. In fact we get
undecidability even ruling out first-order quantification and branching
time. However, we isolate two notable decidable cases. Specifically, we
start by considering service calls that return values deterministically
(depending only on the passed parameters). We show that verification of
$\mu\mathcal{L}_A$ properties is decidable under the assumption that the
cardinality of fresh data introduced along each run is bounded
(run-bounded DCDSs), though it need not be bounded across runs.
Decidability is therefore not obvious, given that the logic permits
quantification over values occurring across (potentially infinitely
many) branching run continuations. Run-boundedness is a semantic
property which we show undecidable to check, but for which we propose a
sufficient syntactic condition related to the notion of weak acyclicity
studied in data exchange [23]. Then, we move to nondeterministic
services, where same-argument service calls possibly return different
values at different moments in time. To exploit the results on
run-bounded DCDSs in this case we would have to limit the number of
service calls that can be invoked during the execution, which would be
too restrictive a condition on the form of DCDSs. So we focus on the
above $\mu\mathcal{L}_P$ fragment of $\mu\mathcal{L}_A$. We show that if
infinitely many values occur in a run but do not accumulate in the same
state (our system is then called state-bounded), then
$\mu\mathcal{L}_P$ verification is decidable. This comes as a pleasant
surprise, given that, compared to run-boundedness, state-boundedness
permits an additional kind of data unboundedness (within the run, as
opposed to only across runs). State-boundedness is a semantic property
as well, and we show that checking it is undecidable. We then give a
novel syntactic condition, "generate-recall acyclicity", which suffices
to enforce that if a service generates new values by being called an
unbounded number of times, then these values cannot be accumulated
("recalled") indefinitely.

The rest of the paper is organized as follows. Sec. 2 introduces
(relational) DCDSs. Sec. 3 introduces verification of DCDSs and
the two variants of µ-calculus that we consider. Sec. 4 focuses on the
analysis of DCDSs under the assumption that external service calls
behave deterministically. Sec. 5 considers the case in which external
service calls behave nondeterministically. Sec. 6 discusses the
various notions introduced. Sec. 7 reports on related work. Finally,
Sec. 8 concludes the paper. All proofs are given in the appendix,
which also includes a full-fledged example of a DCDS.



2. DATA-CENTRIC DYNAMIC SYSTEMS 

In this section, we introduce the notion of (relational) data-centric
dynamic system, or simply DCDS. A DCDS is a pair
$\mathcal{S} = \langle \mathcal{D}, \mathcal{P} \rangle$ formed by two
interacting layers: a data layer $\mathcal{D}$ and a process layer
$\mathcal{P}$ over it. Intuitively, the data layer keeps all the data of
interest, while the process layer modifies and evolves such data. We
keep the structure of both layers to the minimum; in particular, we do
not distinguish between various possible components providing the data,
nor those providing the subprocesses running concurrently. Indeed, the
framework can be further detailed in several directions, while keeping
the results obtained here (cf. Section 6).

2.1 Data Layer 

The data layer represents the information of interest in our
application. It is constituted by a relational schema $\mathcal{R}$
equipped with equality constraints$^1$ $\mathcal{E}$, e.g., to state keys
of relations, and an initial database instance $\mathcal{I}_0$, which
conforms to the relational schema and the equality constraints. The
values stored in this database belong to a predefined, possibly
infinite, set $\mathcal{C}$ of constants. These constants are interpreted
as themselves, blurring the distinction between constants and values.
We will use the two terms interchangeably.

Given a database instance $\mathcal{I}$, its active domain
ADOM($\mathcal{I}$) is the subset of $\mathcal{C}$ such that
$c \in$ ADOM($\mathcal{I}$) if and only if $c$ occurs in $\mathcal{I}$.

A data layer is a tuple
$\mathcal{D} = \langle \mathcal{C}, \mathcal{R}, \mathcal{E}, \mathcal{I}_0 \rangle$ where:

• $\mathcal{C}$ is a countably infinite set of constants/values.

• $\mathcal{R} = \{R_1, \ldots, R_n\}$ is a database schema, constituted
by a finite set of relation schemas.

• $\mathcal{E}$ is a finite set $\{\mathcal{E}_1, \ldots, \mathcal{E}_m\}$
of equality constraints. Each $\mathcal{E}_i$ has the form

$$Q_i \rightarrow \bigwedge_{j=1,\ldots,k} z_{ij} = y_{ij},$$

where $Q_i$ is a domain-independent FO query over $\mathcal{R}$ using
constants from the active domain ADOM($\mathcal{I}_0$) of $\mathcal{I}_0$
and whose free variables are $\vec{x}$, and $z_{ij}$ and $y_{ij}$ are
either variables in $\vec{x}$ or constants in ADOM($\mathcal{I}_0$).$^2$

• $\mathcal{I}_0$ is a database instance that represents the initial
state of the data layer, which conforms to the schema $\mathcal{R}$ and
satisfies the constraints $\mathcal{E}$: namely, for each constraint
$Q_i \rightarrow \bigwedge_{j=1,\ldots,k} z_{ij} = y_{ij}$ and for each
tuple (i.e., substitution for the free variables)
$\theta \in ans(Q_i, \mathcal{I}_0)$, it holds that
$z_{ij}\theta = y_{ij}\theta$.$^3$
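
The data layer can be made concrete with a small sketch. The encoding below is an assumption made purely for illustration (a database instance as a dictionary from relation names to sets of tuples, and a key constraint as the only kind of equality constraint checked); the relation names `Stud` and `Grad` and all constants are hypothetical.

```python
def active_domain(instance):
    """Return ADOM(I): every constant occurring in some tuple of I."""
    return {c for tuples in instance.values() for tup in tuples for c in tup}


def satisfies_key(instance, relation, key_positions):
    """Check one equality constraint stating that key_positions form a key.

    This corresponds to a constraint Q -> z = y in which Q selects two
    tuples of `relation` agreeing on the key positions, and the equalities
    force their remaining attributes to coincide.
    """
    seen = {}
    for tup in instance.get(relation, set()):
        key = tuple(tup[i] for i in key_positions)
        if key in seen and seen[key] != tup:
            return False
        seen[key] = tup
    return True


# Hypothetical initial instance I_0 (names are illustrative only).
I0 = {"Stud": {("s1", "Ann"), ("s2", "Bob")}, "Grad": {("s1", 30)}}
adom = active_domain(I0)
key_ok = satisfies_key(I0, "Stud", [0])
```

Under this encoding, an instance in which two `Stud` tuples share the same first component but differ elsewhere would violate the key constraint and could not serve as $\mathcal{I}_0$.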

2.2 Process Layer 

The process layer constitutes the progression mechanism for the
DCDS. We assume that at every time the current instance of the
data layer can be arbitrarily queried, and can be updated through
action executions, possibly involving external service calls to get
new values from the environment. Hence the process layer is composed
of three main notions: actions, which are the atomic progression steps
for the data layer; external services, which can be called during the
execution of actions; and processes, which are essentially
nondeterministic programs that use actions as atomic instructions.
While we require the execution of actions to be sequential, we do not
impose any such constraints on processes, which in principle can be
formed by several concurrent branches, including fork, join, and so on.
Concurrency is to be interpreted by interleaving and hence reduced to
nondeterminism, as often done in formal verification [4, 22]. There can
be many ways to provide the control flow specification for processes.
Here we adopt a simple rule-based mechanism, but our results can be
immediately generalized to any process formalism whose control flow is
finite-state. Notice that this does not imply that the transition
system associated to a process over the data layer is finite-state as
well, since the data manipulated in the data layer may grow over time
in an unbounded way.

$^1$ Other kinds of constraints can also be included without affecting
the results reported here (cf. Section 6).

$^2$ For convenience, and without loss of generality, we assume that
all constants used inside formulae appear in $\mathcal{I}_0$.

$^3$ We use the notation $t\theta$ (resp., $\varphi\theta$) to denote the
term (resp., the formula) obtained by applying the substitution $\theta$
to $t$ (resp., $\varphi$). Furthermore, given a FO query $Q$ and a
database instance $\mathcal{I}$, the answer $ans(Q, \mathcal{I})$ to $Q$
over $\mathcal{I}$ is the set of assignments $\theta$ from the free
variables of $Q$ to the domain of $\mathcal{I}$, such that
$\mathcal{I} \models Q\theta$. We treat $Q\theta$ as a boolean query, and
with some abuse of notation, we say $ans(Q\theta, \mathcal{I}) \equiv true$
if and only if $\mathcal{I} \models Q\theta$.

Formally, a process layer $\mathcal{P}$ over a data layer
$\mathcal{D} = \langle \mathcal{C}, \mathcal{R}, \mathcal{E}, \mathcal{I}_0 \rangle$
is a tuple $\mathcal{P} = \langle \mathcal{F}, \mathcal{A}, \varrho \rangle$ where:

• $\mathcal{F}$ is a finite set of functions, each representing the
interface to an external service. Such services can be called, and as
a result the function is activated and the answer is produced.
How the result is actually computed is unknown to the DCDS
since the services are indeed external.

• $\mathcal{A}$ is a finite set of actions, whose execution progresses
the data layer, and may involve external service calls.

• $\varrho$ is a finite set of condition-action rules that form the
specification of the overall process, which tells at any moment
which actions can be executed.

An action $\alpha \in \mathcal{A}$ has the form

$$\alpha(p_1, \ldots, p_n) : \{e_1, \ldots, e_m\},$$

where: (i) $\alpha(p_1, \ldots, p_n)$ is the signature of the action,
constituted by a name $\alpha$ and a sequence $p_1, \ldots, p_n$ of input
parameters that need to be substituted with values for the execution of
the action, and (ii) $\{e_1, \ldots, e_m\}$, also denoted as
EFFECT($\alpha$), is a set of effect specifications, whose specified
effects are assumed to take place simultaneously. Each $e_i$ has the
form $q_i^+ \wedge Q_i^- \rightsquigarrow E_i$, where:

• $q_i^+ \wedge Q_i^-$ is a query over $\mathcal{R}$, whose terms are
variables $\vec{x}$, action parameters, and constants from
ADOM($\mathcal{I}_0$). The query $q_i^+$ is a UCQ, and the query $Q_i^-$
is an arbitrary FO formula whose free variables are included in those
of $q_i^+$. Intuitively, $q_i^+$ selects the tuples to instantiate the
effect, and $Q_i^-$ filters away some of them.

• $E_i$ is the effect, i.e., a set of facts for $\mathcal{R}$, which
includes as terms: terms in ADOM($\mathcal{I}_0$), input parameters, free
variables of $q_i^+$, and in addition Skolem terms formed by applying a
function $f \in \mathcal{F}$ to one of the previous kinds of terms. Such
Skolem terms involving functions represent external service
calls and are interpreted so as to return a value chosen by an
external user/environment when executing the action.

The process $\varrho$ is a finite set of condition-action rules, of the
form $Q \mapsto \alpha$, where $\alpha$ is an action in $\mathcal{A}$ and
$Q$ is a FO query over $\mathcal{R}$ whose free variables are exactly the
parameters of $\alpha$, and whose other terms can be either quantified
variables or constants in ADOM($\mathcal{I}_0$).

For a detailed example of a DCDS we refer to Appendix E.

2.3 Semantics via Transition System 

The semantics of a DCDS is defined in terms of a possibly infinite
transition system whose states are labeled by databases. Such a
transition system represents all possible computations that the process
layer can do on the data layer. A transition system $T$ is a tuple
of the form $\langle \Delta, \mathcal{R}, \Sigma, s_0, db, \Rightarrow \rangle$, where:

• $\Delta$ is a countably infinite set of values;

• $\mathcal{R}$ is a database schema;

• $\Sigma$ is a set of states;

• $s_0 \in \Sigma$ is the initial state;

• $db$ is a function that, given a state $s \in \Sigma$, returns the
database associated to $s$, which is made up of values in $\Delta$ and
conforms to $\mathcal{R}$;

• $\Rightarrow \subseteq \Sigma \times \Sigma$ is a transition relation
between pairs of states.



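$$
\begin{array}{rcl}
(Q)^{T}_{v,V} &=& \{ s \in \Sigma \mid ans(Qv, db(s)) \} \\
(\neg\Phi)^{T}_{v,V} &=& \Sigma \setminus (\Phi)^{T}_{v,V} \\
(\Phi_1 \wedge \Phi_2)^{T}_{v,V} &=& (\Phi_1)^{T}_{v,V} \cap (\Phi_2)^{T}_{v,V} \\
(\exists x.\Phi)^{T}_{v,V} &=& \{ s \in \Sigma \mid \exists t.\ t \in \Delta \text{ and } s \in (\Phi)^{T}_{v[x/t],V} \} \\
(\langle - \rangle \Phi)^{T}_{v,V} &=& \{ s \in \Sigma \mid \exists s'.\ s \Rightarrow s' \text{ and } s' \in (\Phi)^{T}_{v,V} \} \\
(Z)^{T}_{v,V} &=& V(Z) \\
(\mu Z.\Phi)^{T}_{v,V} &=& \bigcap \{ S \subseteq \Sigma \mid (\Phi)^{T}_{v,V[Z/S]} \subseteq S \}
\end{array}
$$

Figure 1: Semantics of $\mu\mathcal{L}$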



In order to precisely build the transition system associated to a 
DCDS, we need to better characterize the behavior of the external 
services, which are called in the effects of actions. This is done in 
Sections 4 and 5.



3. VERIFICATION 

To specify dynamic properties over a DCDS, we use µ-calculus
[22, 35, 13], one of the most powerful temporal logics for which
model checking has been investigated in the finite-state setting.
Indeed, such a logic is able to express both linear time logics such as
LTL and PSL, and branching time logics such as CTL and CTL* [17].
The main characteristic of µ-calculus is the ability to express
directly least and greatest fixpoints of (predicate-transformer)
operators formed using formulae relating the current state to the
next one. By using such fixpoint constructs one can easily express
sophisticated properties defined by induction or co-induction. This
is the reason why virtually all logics used in verification can be
considered as fragments of µ-calculus. From a technical viewpoint,
µ-calculus separates local properties, i.e., properties asserted on the
current state or on states that are immediate successors of the current
one, from properties that talk about states that are arbitrarily far
away from the current one [13]. The latter are expressed through
the use of fixpoints.

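In this work, we use a first-order variant of the µ-calculus [33],
called $\mu\mathcal{L}$ and defined as follows:

$$\Phi ::= Q \mid \neg\Phi \mid \Phi_1 \wedge \Phi_2 \mid \exists x.\Phi \mid \langle - \rangle\Phi \mid Z \mid \mu Z.\Phi$$

where $Q$ is a possibly open FO query, and $Z$ is a second order
predicate variable (of arity 0). We make use of the following
abbreviations: $\forall x.\Phi = \neg(\exists x.\neg\Phi)$,
$\Phi_1 \vee \Phi_2 = \neg(\neg\Phi_1 \wedge \neg\Phi_2)$,
$[-]\Phi = \neg\langle - \rangle\neg\Phi$, and
$\nu Z.\Phi = \neg\mu Z.\neg\Phi[Z/\neg Z]$.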

As usual in µ-calculus, formulae of the form $\mu Z.\Phi$ (and
$\nu Z.\Phi$) must obey the syntactic monotonicity of $\Phi$ wrt $Z$,
which states that every occurrence of the variable $Z$ in $\Phi$ must be
within the scope of an even number of negation symbols. This ensures
that the least fixpoint $\mu Z.\Phi$ (as well as the greatest fixpoint
$\nu Z.\Phi$) always exists.
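
The monotonicity requirement is purely syntactic, so it can be checked by a simple traversal that counts negations. The sketch below assumes a hypothetical encoding of formulae as nested tuples (`("query", q)`, `("not", f)`, `("and", f, g)`, `("exists", x, f)`, `("dia", f)` for $\langle - \rangle$, `("var", Z)`, `("mu", Z, f)`); this encoding is chosen for illustration and is not part of the formal development.

```python
def is_monotone(formula, var, negations=0):
    """True iff every free occurrence of `var` lies under an even number of negations."""
    op = formula[0]
    if op == "var":
        return formula[1] != var or negations % 2 == 0
    if op == "query":
        return True
    if op == "not":
        return is_monotone(formula[1], var, negations + 1)
    if op == "and":
        return all(is_monotone(f, var, negations) for f in formula[1:])
    if op == "exists":
        return is_monotone(formula[2], var, negations)
    if op == "dia":
        return is_monotone(formula[1], var, negations)
    if op == "mu":
        # An inner binder with the same name shadows `var`.
        if formula[1] == var:
            return True
        return is_monotone(formula[2], var, negations)
    raise ValueError(f"unknown operator: {op}")


def well_formed(formula):
    """Check syntactic monotonicity for every mu-subformula."""
    op = formula[0]
    if op in ("var", "query"):
        return True
    if op == "mu":
        return is_monotone(formula[2], formula[1]) and well_formed(formula[2])
    subformulas = [f for f in formula[1:] if isinstance(f, tuple)]
    return all(well_formed(f) for f in subformulas)


# mu Z.(Stud or <->Z), with "or" unfolded into not/and: Z sits under two negations.
good = ("mu", "Z", ("not", ("and", ("not", ("query", "Stud")),
                            ("not", ("dia", ("var", "Z"))))))
# mu Z.(not Z): Z sits under one negation, so the formula is rejected.
bad = ("mu", "Z", ("not", ("var", "Z")))
```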

Since $\mu\mathcal{L}$ also contains formulae with both individual and
predicate free variables, given a transition system $T$, we introduce an
individual variable valuation $v$, i.e., a mapping from individual
variables $x$ to $\Delta$, and a predicate variable valuation $V$, i.e.,
a mapping from predicate variables $Z$ to subsets of $\Sigma$. With these
three notions in place, we assign meaning to formulae by associating to
$T$, $v$, and $V$ an extension function $(\cdot)^{T}_{v,V}$, which maps
formulae to subsets of $\Sigma$. Formally, the extension function
$(\cdot)^{T}_{v,V}$ is defined inductively as shown in Figure 1.
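
For intuition only, the extension function can be computed directly when both $\Sigma$ and $\Delta$ are finite. The sketch below makes exactly that assumption, which does not hold for DCDSs in general; the tuple encoding of formulae, the dictionary layout of the transition system, and the `Stud` relation are all hypothetical. The least fixpoint is obtained by iterating from the empty set, which on a finite state space and for syntactically monotone formulae coincides with the intersection of all pre-fixpoints.

```python
def extension(phi, ts, v, V):
    """Set of states of a finite transition system `ts` that satisfy `phi`."""
    states, delta, succ, db = ts["states"], ts["delta"], ts["succ"], ts["db"]
    op = phi[0]
    if op == "query":      # ("query", fn) with fn(database, valuation) -> bool
        return {s for s in states if phi[1](db[s], v)}
    if op == "not":
        return states - extension(phi[1], ts, v, V)
    if op == "and":
        return extension(phi[1], ts, v, V) & extension(phi[2], ts, v, V)
    if op == "exists":     # ("exists", x, body): try every value t in Delta
        result = set()
        for t in delta:
            result |= extension(phi[2], ts, {**v, phi[1]: t}, V)
        return result
    if op == "dia":        # some successor satisfies the body
        target = extension(phi[1], ts, v, V)
        return {s for s in states if succ[s] & target}
    if op == "var":
        return V[phi[1]]
    if op == "mu":         # least fixpoint by iteration from the empty set
        current = set()
        while True:
            nxt = extension(phi[2], ts, v, {**V, phi[1]: current})
            if nxt == current:
                return current
            current = nxt
    raise ValueError(f"unknown operator: {op}")


def lor(a, b):
    """Disjunction as the abbreviation not(not a and not b)."""
    return ("not", ("and", ("not", a), ("not", b)))


# Hypothetical finite system: 0 -> 1 -> 2 -> 2, plus an isolated loop on 3.
example_ts = {
    "states": {0, 1, 2, 3},
    "delta": {"ann", "bob"},
    "succ": {0: {1}, 1: {2}, 2: {2}, 3: {3}},
    "db": {0: {}, 1: {}, 2: {"Stud": {("ann",)}}, 3: {}},
}
stud = ("query", lambda database, v: (v["x"],) in database.get("Stud", set()))
phi = ("exists", "x", ("mu", "Z", lor(stud, ("dia", ("var", "Z")))))
result = extension(phi, example_ts, {}, {})
```

Here `result` collects the states from which some value eventually denotes a student along some path; state 3 is excluded because it never reaches a state containing a `Stud` fact.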



Example 3.1. An example of $\mu\mathcal{L}$ formula is:

$$\exists x_1, \ldots, x_n. \bigwedge_{i \neq j} x_i \neq x_j \wedge \bigwedge_{i \in \{1,\ldots,n\}} \mu Z.[Stud(x_i) \vee \langle - \rangle Z] \qquad (1)$$

The formula asserts that there are at least $n$ distinct objects/values,
each of which eventually denotes a student along some execution
path. Notice that the formula does not imply that all of these
students will be in the same state, nor that they will all occur in a
single run. It only says that in the entire transition system there are
(at least) $n$ distinct students. ■

When $\Phi$ is a closed formula, $(\Phi)^{T}_{v,V}$ depends neither on
$v$ nor on $V$, and we denote the extension of $\Phi$ simply by
$(\Phi)^{T}$. We say that a closed formula $\Phi$ holds in a state
$s \in \Sigma$ if $s \in (\Phi)^{T}$. In this case, we write
$T, s \models \Phi$. We say that a closed formula $\Phi$ holds in $T$,
denoted by $T \models \Phi$, if $T, s_0 \models \Phi$, where $s_0$ is the
initial state of $T$. We call model checking the problem of verifying
whether $T \models \Phi$ holds.

In particular we are interested in formally verifying properties
of a DCDS. Given the transition system $T_{\mathcal{S}}$ of a DCDS
$\mathcal{S}$ and a dynamic property $\Phi$ expressed in
$\mu\mathcal{L}$,$^4$ we say that $\mathcal{S}$ verifies $\Phi$ if
$T_{\mathcal{S}} \models \Phi$.

The challenging point is that $T_{\mathcal{S}}$ is in general
infinite-state, so we would like to devise a finite-state transition
system which is a faithful abstraction of $T_{\mathcal{S}}$, in the sense
that it preserves the truth value of all $\mu\mathcal{L}$ formulae.
Unfortunately, this program is doomed right from the start if we insist
on using full $\mu\mathcal{L}$ as the verification formalism. Indeed,
formulae of the form (1) defeat any kind of finite-state transition
system. So next we introduce two interesting sublogics of
$\mu\mathcal{L}$ that better serve our objective.

3.1 History Preserving Mu-Calculus 

The first fragment of $\mu\mathcal{L}$ that we consider is
$\mu\mathcal{L}_A$, which is characterized by the assumption that
quantification over individuals is restricted to individuals that are
present in the current database. To enforce such a restriction, we
introduce a special predicate LIVE($x$), which states that $x$ belongs
to the current active domain. The logic $\mu\mathcal{L}_A$ is defined as
follows:

$$\Phi ::= Q \mid \neg\Phi \mid \Phi_1 \wedge \Phi_2 \mid \exists x.\mathrm{LIVE}(x) \wedge \Phi \mid \langle - \rangle\Phi \mid Z \mid \mu Z.\Phi$$

We make use of the usual abbreviations, including
$\forall x.\mathrm{LIVE}(x) \rightarrow \Phi = \neg(\exists x.\mathrm{LIVE}(x) \wedge \neg\Phi)$.
Formally, the extension function $(\cdot)^{T}_{v,V}$ is defined
inductively as in Figure 1, with the new special predicate LIVE($x$)
interpreted as follows:

$$(\mathrm{LIVE}(x))^{T}_{v,V} = \{ s \in \Sigma \mid x/d \in v \text{ implies } d \in \mathrm{ADOM}(db(s)) \}$$

Example 3.2. As an example, consider the following $\mu\mathcal{L}_A$
formula:

$$\nu X.(\forall x.\mathrm{LIVE}(x) \wedge Stud(x) \rightarrow \mu Y.(\exists y.\mathrm{LIVE}(y) \wedge Grad(x,y) \vee \langle - \rangle Y) \wedge [-]X),$$

which states that, along every path, it is always true, for each
student $x$, that there exists an evolution that eventually leads to a
graduation of the student (with some final mark $y$). ■

We are going to show that under suitable conditions we can get
a faithful finite abstraction for a DCDS that preserves all formulae
of $\mu\mathcal{L}_A$, and hence enables us in principle to use standard
model checking techniques. Towards this goal, we introduce a notion of
bisimulation that is suitable for the kind of transition systems we
consider here. In particular, we have to take into account that the
two transition systems are over different data domains, and hence
we have to consider the correspondence between the data in the two
transition systems and how such data evolve over time. To do so,
we introduce the following notions.

$^4$ We remind the reader that, without loss of generality, we assume
that all constants used inside formulae $\Phi$ appear in the initial
database instance of the DCDS.

Given two domains $\Delta_1$ and $\Delta_2$, a partial bijection $h$
between $\Delta_1$ and $\Delta_2$ is a bijection between a subset of
$\Delta_1$ and a subset of $\Delta_2$. Given a partial function
$f : S \rightarrow S'$, we denote with DOM($f$) the domain of $f$, i.e.,
the set of elements in $S$ on which $f$ is defined, and with IM($f$) the
image of $f$, i.e., the set of elements $s'$ in $S'$ such that
$s' = f(s)$ for some $s \in S$. A partial bijection $h'$ extends $h$ if
DOM($h$) $\subseteq$ DOM($h'$) (or equivalently IM($h$) $\subseteq$
IM($h'$)) and $h'(x) = h(x)$ for all $x \in$ DOM($h$) (or equivalently
$h'^{-1}(y) = h^{-1}(y)$ for all $y \in$ IM($h$)). Let $db_1$ and $db_2$
be two databases over two domains $\Delta_1$ and $\Delta_2$
respectively, both conforming to the same schema $\mathcal{R}$. We say
that a partial bijection $h$ induces an isomorphism between $db_1$ and
$db_2$ if ADOM($db_1$) $\subseteq$ DOM($h$), ADOM($db_2$) $\subseteq$
IM($h$), and $h$ projected on ADOM($db_1$) is an isomorphism between
$db_1$ and $db_2$.
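
These notions can be checked mechanically on finite databases. The sketch below assumes, for illustration only, that a partial bijection is given as a Python dictionary and a database as a dictionary from relation names to sets of tuples; the function names are hypothetical.

```python
def active_domain(db):
    """ADOM(db): all values occurring in some tuple of db."""
    return {c for tuples in db.values() for tup in tuples for c in tup}


def induces_isomorphism(h, db1, db2):
    """Check whether the partial bijection h induces an isomorphism db1 -> db2."""
    # A dictionary is a partial bijection only if it is injective.
    if len(set(h.values())) != len(h):
        return False
    # ADOM(db1) must lie within DOM(h), and ADOM(db2) within IM(h).
    if not active_domain(db1) <= set(h):
        return False
    if not active_domain(db2) <= set(h.values()):
        return False
    # Mapping every tuple of db1 through h must yield exactly db2.
    for relation in set(db1) | set(db2):
        image = {tuple(h[c] for c in tup) for tup in db1.get(relation, set())}
        if image != db2.get(relation, set()):
            return False
    return True


def extends(h_new, h):
    """h_new extends h: it agrees with h on DOM(h) (h_new is assumed injective)."""
    return all(x in h_new and h_new[x] == y for x, y in h.items())


h = {"a": 1, "b": 2, "c": 3}
db1 = {"R": {("a", "b")}}
db2 = {"R": {(1, 2)}}
iso = induces_isomorphism(h, db1, db2)
```

Note that `h` may mention values (here `"c"`) outside the active domain of `db1`; this is what allows a bisimulation to remember values that have left the database.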

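Given two transition systems
$T_1 = \langle \Delta_1, \mathcal{R}, \Sigma_1, s_{01}, db_1, \Rightarrow_1 \rangle$
and
$T_2 = \langle \Delta_2, \mathcal{R}, \Sigma_2, s_{02}, db_2, \Rightarrow_2 \rangle$,
and the set $H$ of partial bijections between $\Delta_1$ and $\Delta_2$,
a history preserving bisimulation between $T_1$ and $T_2$ is a relation
$\mathcal{B} \subseteq \Sigma_1 \times H \times \Sigma_2$ such that
$\langle s_1, h, s_2 \rangle \in \mathcal{B}$ implies that:

1. $h$ is a partial bijection between $\Delta_1$ and $\Delta_2$ that
induces an isomorphism between $db_1(s_1)$ and $db_2(s_2)$;

2. for each $s_1'$, if $s_1 \Rightarrow_1 s_1'$ then there is an $s_2'$
with $s_2 \Rightarrow_2 s_2'$ and a bijection $h'$ that extends $h$, such
that $\langle s_1', h', s_2' \rangle \in \mathcal{B}$;

3. for each $s_2'$, if $s_2 \Rightarrow_2 s_2'$ then there is an $s_1'$
with $s_1 \Rightarrow_1 s_1'$ and a bijection $h'$ that extends $h$, such
that $\langle s_1', h', s_2' \rangle \in \mathcal{B}$.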

A state $s_1 \in \Sigma_1$ is history preserving bisimilar to
$s_2 \in \Sigma_2$ wrt a partial bijection $h$, written
$s_1 \sim_h s_2$, if there exists a history preserving bisimulation
$\mathcal{B}$ between $T_1$ and $T_2$ such that
$\langle s_1, h, s_2 \rangle \in \mathcal{B}$. A state $s_1 \in \Sigma_1$
is history preserving bisimilar to $s_2 \in \Sigma_2$, written
$s_1 \sim s_2$, if there exists a partial bijection $h$ and a history
preserving bisimulation $\mathcal{B}$ between $T_1$ and $T_2$ such that
$\langle s_1, h, s_2 \rangle \in \mathcal{B}$. A transition system $T_1$
is history preserving bisimilar to $T_2$, written $T_1 \sim T_2$, if
there exists a partial bijection $h_0$ and a history preserving
bisimulation $\mathcal{B}$ between $T_1$ and $T_2$ such that
$\langle s_{01}, h_0, s_{02} \rangle \in \mathcal{B}$. The next theorem
gives us the classical invariance result of µ-calculus wrt bisimulation,
in our setting.

Theorem 3.1. Consider two transition systems $T_1$ and $T_2$
such that $T_1 \sim T_2$. Then for every closed $\mu\mathcal{L}_A$
formula $\Phi$, we have:

$$T_1 \models \Phi \text{ if and only if } T_2 \models \Phi.$$

3.2 Persistence Preserving Mu-Calculus 

The second fragment of $\mu\mathcal{L}$ that we consider is
$\mu\mathcal{L}_P$, which further restricts $\mu\mathcal{L}_A$ by
requiring that individuals over which we quantify must continuously
persist along the system evolution for the quantification to take
effect.

With a slight abuse of notation, in the following we write
$\mathrm{LIVE}(x_1, \ldots, x_n) = \bigwedge_{i \in \{1,\ldots,n\}} \mathrm{LIVE}(x_i)$.

The logic $\mu\mathcal{L}_P$ is defined as follows:

$$\Phi ::= Q \mid \neg\Phi \mid \Phi_1 \wedge \Phi_2 \mid \exists x.\mathrm{LIVE}(x) \wedge \Phi \mid \langle - \rangle(\mathrm{LIVE}(\vec{x}) \wedge \Phi) \mid [-](\mathrm{LIVE}(\vec{x}) \wedge \Phi) \mid Z \mid \mu Z.\Phi$$

where $Q$ is a possibly open FO query, $Z$ is a second order predicate
variable, and the following assumption holds: in
$\langle - \rangle(\mathrm{LIVE}(\vec{x}) \wedge \Phi)$ and
$[-](\mathrm{LIVE}(\vec{x}) \wedge \Phi)$, the variables $\vec{x}$ are
exactly the free variables of $\Phi$, with the proviso that we
substitute to each bound predicate variable $Z$ in $\Phi$ its bounding
formula $\mu Z.\Phi'$. We use the usual abbreviations, including:
$\langle - \rangle(\mathrm{LIVE}(\vec{x}) \rightarrow \Phi) = \neg[-](\mathrm{LIVE}(\vec{x}) \wedge \neg\Phi)$
and
$[-](\mathrm{LIVE}(\vec{x}) \rightarrow \Phi) = \neg\langle - \rangle(\mathrm{LIVE}(\vec{x}) \wedge \neg\Phi)$.
Intuitively, the use of LIVE($\cdot$) in $\mu\mathcal{L}_P$ ensures that
individuals are only considered if they persist along the system
evolution, while the evaluation of a formula with individuals that are
not present in the current database trivially leads to false or true
(depending on the use of negation).

Example 3.3. Getting back to the example above, its variant
in $\mu\mathcal{L}_P$ is

$$\nu X.(\forall x.\mathrm{LIVE}(x) \wedge Stud(x) \rightarrow \mu Y.(\exists y.\mathrm{LIVE}(y) \wedge Grad(x,y) \vee \langle - \rangle(\mathrm{LIVE}(x) \wedge Y)) \wedge [-]X)$$

which states that, along every path, it is always true, for each
student $x$, that there exists an evolution in which $x$ persists in the
database until she eventually graduates (with some final mark $y$).
Formula

$$\nu X.(\forall x.\mathrm{LIVE}(x) \wedge Stud(x) \rightarrow \mu Y.(\exists y.\mathrm{LIVE}(y) \wedge Grad(x,y) \vee \langle - \rangle(\mathrm{LIVE}(x) \rightarrow Y)) \wedge [-]X)$$

instead states that, along every path, it is always true, for each
student $x$, that there exists an evolution in which either $x$ is not
persisted, or eventually graduates (with final mark $y$). ■

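The bisimulation relation that captures $\mu\mathcal{L}_P$ is as
follows. Given two transition systems
$T_1 = \langle \Delta_1, \mathcal{R}, \Sigma_1, s_{01}, db_1, \Rightarrow_1 \rangle$
and
$T_2 = \langle \Delta_2, \mathcal{R}, \Sigma_2, s_{02}, db_2, \Rightarrow_2 \rangle$,
and the set $H$ of partial bijections between $\Delta_1$ and $\Delta_2$,
a persistence preserving bisimulation between $T_1$ and $T_2$ is a
relation $\mathcal{B} \subseteq \Sigma_1 \times H \times \Sigma_2$ such
that $\langle s_1, h, s_2 \rangle \in \mathcal{B}$ implies that:

1. $h$ is an isomorphism between $db_1(s_1)$ and $db_2(s_2)$;$^5$

2. for each $s_1'$, if $s_1 \Rightarrow_1 s_1'$ then there exists an
$s_2'$ with $s_2 \Rightarrow_2 s_2'$ and a bijection $h'$ that extends
$h|_{\mathrm{ADOM}(db_1(s_1)) \cap \mathrm{ADOM}(db_1(s_1'))}$,$^6$ such
that $\langle s_1', h', s_2' \rangle \in \mathcal{B}$;

3. for each $s_2'$, if $s_2 \Rightarrow_2 s_2'$ then there exists an
$s_1'$ with $s_1 \Rightarrow_1 s_1'$ and a bijection $h'$ that extends
$h|_{\mathrm{ADOM}(db_1(s_1)) \cap \mathrm{ADOM}(db_1(s_1'))}$, such that
$\langle s_1', h', s_2' \rangle \in \mathcal{B}$.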
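
The only difference from the history preserving case is the map that $h'$ has to extend: the correspondence is kept only for values that are in the active domain both before and after the step. A minimal sketch of that restriction, assuming the same illustrative dictionary encoding of databases and partial bijections (function names are hypothetical):

```python
def active_domain(db):
    """ADOM(db): all values occurring in some tuple of db."""
    return {c for tuples in db.values() for tup in tuples for c in tup}


def restrict_to_persisting(h, db1_current, db1_next):
    """Restrict h to the values present in both the current and the next database."""
    persisting = active_domain(db1_current) & active_domain(db1_next)
    return {x: y for x, y in h.items() if x in persisting}


# Value "a" disappears in the successor state, so its image is forgotten
# and may later be matched with a different value on the other side.
h = {"a": 1, "b": 2}
current = {"R": {("a", "b")}}
successor = {"R": {("b", "c")}}
restricted = restrict_to_persisting(h, current, successor)
```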

We say that a state $s_1 \in \Sigma_1$ is persistence preserving
bisimilar to $s_2 \in \Sigma_2$ wrt a partial bijection $h$, written
$s_1 \sim_h s_2$, if there exists a persistence preserving bisimulation
$\mathcal{B}$ between $T_1$ and $T_2$ such that
$\langle s_1, h, s_2 \rangle \in \mathcal{B}$. A state $s_1 \in \Sigma_1$
is persistence preserving bisimilar to $s_2 \in \Sigma_2$, written
$s_1 \sim s_2$, if there exists a partial bijection $h$ and a persistence
preserving bisimulation $\mathcal{B}$ between $T_1$ and $T_2$ such that
$\langle s_1, h, s_2 \rangle \in \mathcal{B}$. A transition system $T_1$
is persistence preserving bisimilar to $T_2$, written $T_1 \sim T_2$, if
there exists a partial bijection $h_0$ and a persistence preserving
bisimulation $\mathcal{B}$ between $T_1$ and $T_2$ such that
$\langle s_{01}, h_0, s_{02} \rangle \in \mathcal{B}$. The next theorem
shows the invariance of $\mu\mathcal{L}_P$ under this notion of
bisimulation.

Theorem 3.2. Consider two transition systems $T_1$ and $T_2$
such that $T_1 \sim T_2$. Then for every closed $\mu\mathcal{L}_P$
formula $\Phi$, we have:

$$T_1 \models \Phi \text{ if and only if } T_2 \models \Phi.$$



4. DETERMINISTIC SERVICES 

Now we turn back to the semantics of DCDSs, and analyze them
under the assumption that external services behave deterministically.
This means that the evaluation of functions $f \in \mathcal{F}$,
representing the service interfaces in the process layer, is
independent from the moment in which the function is called: whenever
an external service is called twice with the same parameters, it must
return the same value. So, for example, if the function invocation
$f(a)$ returned $b$ at a certain time, then in all successive moments
the call $f(a)$ will return $b$ again. In particular, stateless services
can be modeled with deterministic service calls.

Under this characterization of the services we can now define
the transition system of a DCDS. We call such a transition system the
"concrete" transition system, to avoid confusion with an "abstract"
transition system that we are going to introduce for our verification
technique.

4.1 Semantics 

Let $\mathcal{S} = \langle \mathcal{D}, \mathcal{P} \rangle$ be a DCDS
with data layer
$\mathcal{D} = \langle \mathcal{C}, \mathcal{R}, \mathcal{E}, \mathcal{I}_0 \rangle$
and process layer
$\mathcal{P} = \langle \mathcal{F}, \mathcal{A}, \varrho \rangle$.

First we focus on what is needed to characterize the states of
the concrete transition system. One such state obviously needs to
maintain the current instance of the data layer. This instance is a
database made up of constants in $\mathcal{C}$, which conforms to the
schema $\mathcal{R}$ and satisfies the equality constraints in
$\mathcal{E}$. Together with the current instance, however, we also need
to remember all answers we had so far when calling the external
services.

To meet the requirement that service calls behave deterministically,
the states of the transition system keep track of all results of the
service calls made so far, in the form of equalities between Skolem
terms involving functions in $\mathcal{F}$ and having as arguments
constants and returned values in $\mathcal{C}$.$^7$ More precisely,
we define the set of (Skolem terms representing) service calls as
$\mathbb{SC} = \{ f(v_1, \ldots, v_n) \mid f/n \in \mathcal{F} \text{ and } \{v_1, \ldots, v_n\} \subseteq \mathcal{C} \}$,
where $f/n$ stands for a function $f$ of arity $n$. Then we introduce a
service call map, which is a partial function
$\mathcal{M} : \mathbb{SC} \rightarrow \mathcal{C}$.
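
One way to picture a service call map is as a memo table keyed by ground service calls: the first answer to $f(a)$ is recorded, and every later call with the same arguments returns it. The class below is a hypothetical illustration of this idea (its name, methods, and the `pick_value` callback standing in for the external environment are assumptions, not part of the formal model).

```python
class ServiceCallMap:
    """Partial function from ground service calls f(v1, ..., vn) to values."""

    def __init__(self):
        self._answers = {}

    def call(self, name, args, pick_value):
        """Return the recorded answer, or record the value chosen by pick_value()."""
        key = (name, tuple(args))
        if key not in self._answers:
            # First invocation: the environment chooses the result.
            self._answers[key] = pick_value()
        return self._answers[key]

    def domain(self):
        """DOM(M): the service calls answered so far."""
        return set(self._answers)


calls = ServiceCallMap()
first = calls.call("f", ["a"], lambda: "b")
# A second call with the same arguments ignores the new choice.
second = calls.call("f", ["a"], lambda: "c")
```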

Now we are ready to formally define states of the concrete transition
system. A concrete state, or simply state, is a pair
$\langle \mathcal{I}, \mathcal{M} \rangle$, where $\mathcal{I}$ is a
relational instance of $\mathcal{R}$ over $\mathcal{C}$ satisfying each
equality constraint in $\mathcal{E}$, and $\mathcal{M}$ is a service call
map. The initial concrete state is
$\langle \mathcal{I}_0, \emptyset \rangle$.

Next we look at the result of executing an action in a state. For
this it is convenient to denote by $\mathcal{M}(E)$ the database instance
obtained by applying a service call map $\mathcal{M}$ to a set $E$ of
facts including only constants in $\mathcal{C}$ or terms in
DOM($\mathcal{M}$). Namely, we define $\mathcal{M}(E)$ as the application
of $\mathcal{M}$ to all the terms appearing in $E$, where constants are
preserved. Formally,
$\mathcal{M}(E) = \{ R(c_1, \ldots, c_n) \mid R(t_1, \ldots, t_n) \in E \text{ and, for } i \in \{1, \ldots, n\},\ c_i = t_i \text{ if } t_i \in \mathcal{C} \text{ and } c_i = \mathcal{M}(t_i) \text{ if } t_i \in \mathrm{DOM}(\mathcal{M}) \}$.
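
A minimal sketch of $\mathcal{M}(E)$, under an encoding assumed here for illustration: a fact is a pair (relation name, tuple of terms), a constant is any non-tuple value, and a ground service call is a pair (function name, argument tuple). The service call map is a plain dictionary.

```python
def apply_call_map(call_map, facts):
    """M(E): replace every service-call term by its recorded result; keep constants."""
    def resolve(term):
        # Assumed encoding: tuples are service calls, everything else is a constant.
        if isinstance(term, tuple):
            return call_map[term]  # raises KeyError if the call is not in DOM(M)
        return term

    instance = {}
    for relation, terms in facts:
        instance.setdefault(relation, set()).add(tuple(resolve(t) for t in terms))
    return instance


# Hypothetical map recording that f(a) returned b.
M = {("f", ("a",)): "b"}
E = {("R", ("a", ("f", ("a",))))}
resolved = apply_call_map(M, E)
```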

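Let $\alpha$ be an action in $\mathcal{A}$ of the form
$\alpha(p_1, \ldots, p_m) : \{e_1, \ldots, e_{m'}\}$ with
$e_i = q_i^+ \wedge Q_i^- \rightsquigarrow E_i$. The parameters for
$\alpha$ are guarded by the condition-action rule $Q \mapsto \alpha$ in
$\varrho$. Let $\sigma$ be a substitution for the input parameters
$p_1, \ldots, p_m$ with values taken from $\mathcal{C}$. We say that
$\sigma$ is legal for $\alpha$ in state
$\langle \mathcal{I}, \mathcal{M} \rangle$ if
$\langle p_1, \ldots, p_m \rangle\sigma \in ans(Q, \mathcal{I})$.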



Notice that this implies $\mathrm{DOM}(h) = \mathrm{ADOM}(db_1(s_1))$ and
$\mathrm{IM}(h) = \mathrm{ADOM}(db_2(s_2))$.

Given a set $D$, we denote by $f|_D$ the restriction of $f$ to $D$, i.e.,
$\mathrm{DOM}(f|_D) = \mathrm{DOM}(f) \cap D$, and $f|_D(x) = f(x)$ for every
$x \in \mathrm{DOM}(f) \cap D$.



Notice that we have no knowledge of the specific functions
adopted by the external services, and we simply assume that such
functions return some value from $\mathcal{C}$. We are going to have different
executions of the system corresponding to each way to assign
values to the Skolem terms representing the service calls.



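Concrete action execution. To capture what happens when $\alpha$ is
executed in a state using a substitution $\sigma$ for its parameters, we
introduce a transition relation $\mathrm{EXEC}_{\mathcal{S}}$ between states, called
concrete execution of $\alpha\sigma$, such that
$\langle \langle \mathcal{I}, \mathcal{M} \rangle, \alpha\sigma, \langle \mathcal{I}', \mathcal{M}' \rangle \rangle \in \mathrm{EXEC}_{\mathcal{S}}$
if the following holds:

1. $\sigma$ is a legal parameter assignment for $\alpha$ in state $\langle \mathcal{I}, \mathcal{M} \rangle$,

2. $\mathcal{M}' = \mathrm{SERVICECALLS}(\mathcal{I}, \alpha\sigma, \mathcal{M})$,

3. $\mathcal{I}' = \mathcal{M}'(\mathrm{DO}(\mathcal{I}, \alpha\sigma))$, and

4. $\mathcal{I}'$ satisfies $\mathcal{E}$,

where $\mathrm{DO}()$ and $\mathrm{SERVICECALLS}()$ are defined as follows.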



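$$\mathrm{DO}(\mathcal{I}, \alpha\sigma) = \bigcup_{q_i^+ \land Q_i^- \rightsquigarrow E_i \,\in\, \mathrm{EFFECT}(\alpha)} \;\; \bigcup_{\theta \,\in\, \mathit{ans}((q_i^+ \land Q_i^-)\sigma,\, \mathcal{I})} E_i\sigma\theta$$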



applies the action $\alpha$ to $\mathcal{I}$, using $\sigma$ as the assignment for its
parameters. The returned instance is the union of the results of applying
the effect specifications $\mathrm{EFFECT}(\alpha)$, where the result of each
effect specification $q_i^+ \land Q_i^- \rightsquigarrow E_i$ is, in turn, the set of
facts $E_i\sigma\theta$ obtained from $E_i\sigma$ grounded on all the assignments
$\theta$ that satisfy the query $q_i^+ \land Q_i^-$ over $\mathcal{I}$.

$$\mathrm{SERVICECALLS}(\mathcal{I}, \alpha\sigma, \mathcal{M}) = \mathcal{M} \cup \{ t \mapsto \mathrm{PICKVALUE}(\mathcal{C}) \mid t \text{ occurring in } \mathrm{DO}(\mathcal{I}, \alpha\sigma) \text{ and not in } \mathrm{DOM}(\mathcal{M}) \}$$

nondeterministically generates all possible values that can be
returned by the service calls, guaranteeing that external services
behave in a deterministic manner. More specifically, all the
service calls already contained in $\mathcal{M}$ are maintained, while
new service calls are nondeterministically bound to an arbitrary
value $\mathrm{PICKVALUE}(\mathcal{C})$ taken from $\mathcal{C}$ (which will be the value
assumed by such service calls in $\mathcal{M}$ from now on in the execution).
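The way a service call map is extended can be sketched as follows. The sketch makes two assumptions that are not in the formal definition: the infinite domain is replaced by a finite `sample_domain`, and Skolem terms are encoded as tuples such as `("f", "a")`. The function name `service_call_extensions` is hypothetical.

```python
from itertools import product


def service_call_extensions(call_map, new_terms, sample_domain):
    """Yield every extension M' of call_map that also binds the unseen calls.

    Bindings already in call_map are never changed, which is what makes
    the services behave deterministically along a run.
    """
    fresh = sorted(t for t in set(new_terms) if t not in call_map)
    for values in product(sample_domain, repeat=len(fresh)):
        extended = dict(call_map)
        extended.update(zip(fresh, values))
        yield extended


# f(a) was already answered with "b"; g(a) is issued for the first time.
M = {("f", "a"): "b"}
extensions = list(service_call_extensions(M, [("f", "a"), ("g", "a")], ["a", "b", "c"]))
```

Each element of `extensions` corresponds to one nondeterministic choice of $\mathrm{PICKVALUE}$; over the real (infinite) domain there would be infinitely many such extensions.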

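Concrete transition system. The concrete transition system
$\Upsilon_{\mathcal{S}}$ for $\mathcal{S}$ is a possibly infinite-state transition system
$\langle \mathcal{C}, \mathcal{R}, \Sigma, s_0, db, \Rightarrow \rangle$ where
$s_0 = \langle \mathcal{I}_0, \emptyset \rangle$ and $db$ is such that
$db(\langle \mathcal{I}, \mathcal{M} \rangle) = \mathcal{I}$. Specifically, we define by
simultaneous induction $\Sigma$ and $\Rightarrow$ as the smallest sets satisfying
the following properties: (i) $s_0 \in \Sigma$; (ii) if
$\langle \mathcal{I}, \mathcal{M} \rangle \in \Sigma$, then for all substitutions $\sigma$
for the input parameters of $\alpha$ and for every
$\langle \mathcal{I}', \mathcal{M}' \rangle$ such that
$\langle \langle \mathcal{I}, \mathcal{M} \rangle, \alpha\sigma, \langle \mathcal{I}', \mathcal{M}' \rangle \rangle \in \mathrm{EXEC}_{\mathcal{S}}$,
we have $\langle \mathcal{I}', \mathcal{M}' \rangle \in \Sigma$ and
$\langle \mathcal{I}, \mathcal{M} \rangle \Rightarrow \langle \mathcal{I}', \mathcal{M}' \rangle$.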

Intuitively, to define the concrete transition system of the DCDS
$\mathcal{S}$ we start from the initial state $s_0 = \langle \mathcal{I}_0, \emptyset \rangle$,
and for each rule $Q \mapsto \alpha$ in $\mathcal{P}$, we evaluate $Q$ over
$\mathcal{I}_0$, and calculate all states $s$ such that
$\langle s_0, \alpha\sigma, s \rangle \in \mathrm{EXEC}_{\mathcal{S}}$. Then we repeat the
same steps considering each $s$, and so on. The computation of successor
states can be done by picking all the possible combinations of resulting
values for the newly introduced service calls, then checking whether the
successor obtained for a combination satisfies the equality constraints,
and filtering it away if this is not the case. It is worth noting that when
new service calls are considered, the successors can be countably
infinite.

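Example 4.1. Let $\mathcal{S} = \langle \mathcal{D}, \mathcal{P} \rangle$ be a DCDS with data layer
$\mathcal{D} = \langle \mathcal{C}, \mathcal{R}, \mathcal{E}, \mathcal{I}_0 \rangle$ and process layer
$\mathcal{P} = \langle \mathcal{F}, \mathcal{A}, \varrho \rangle$, where
$\mathcal{F} = \{f/1, g/1\}$, $\mathcal{R} = \{Q/2, P/1, R/1\}$, $\mathcal{E} = \emptyset$,
$\mathcal{I}_0 = \{P(a), Q(a, a)\}$, $\varrho = \{\mathsf{true} \mapsto \alpha\}$,
$\mathcal{A} = \{\alpha\}$, and

$\alpha : \{\, Q(a, a) \land P(x) \rightsquigarrow \{R(x)\},\; P(x) \rightsquigarrow \{P(x), Q(f(x), g(x))\} \,\}$

The concrete transition system $\Upsilon_{\mathcal{S}}$ contains infinitely many successors
connected to the initial state. These successors result from the
assignment of each possible pair of values to $f(a)$ and $g(a)$ (see
also Figure 3(a)). ■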



Example 4.2. Consider a variation of the DCDS described in
Example 4.1, where the data layer is equipped with an equality
constraint, i.e., $\mathcal{E} = \{P(x) \land Q(y, z) \rightarrow x = y\}$. The resulting
concrete transition system still has infinitely many successors of
the initial state, but the presence of the equality constraint requires
keeping only those successors in which $f(a)$ returns $a$ (see also
Figure 2(a)). ■
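The first step of Examples 4.1 and 4.2 can be replayed on a small scale. The sketch below is a simplification under stated assumptions: the effects of the action are hard-coded instead of being evaluated by a query engine, the infinite domain is replaced by the hypothetical `SAMPLE_DOMAIN`, and only successors of the initial state (empty service call map) are computed.

```python
from itertools import product

# Hypothetical finite sample of the infinite domain C, for illustration only.
SAMPLE_DOMAIN = ["a", "b", "c"]

# I_0 = {P(a), Q(a, a)}; a Skolem term f(x) is encoded as the tuple ("f", x).
I0 = {("P", ("a",)), ("Q", ("a", "a"))}


def do_alpha(instance):
    """DO(I, alpha) for the single action of Example 4.1 (effects hard-coded)."""
    out = set()
    for x in [args[0] for rel, args in instance if rel == "P"]:
        if ("Q", ("a", "a")) in instance:       # Q(a,a) and P(x) ~> R(x)
            out.add(("R", (x,)))
        out.add(("P", (x,)))                    # P(x) ~> P(x), Q(f(x), g(x))
        out.add(("Q", (("f", x), ("g", x))))
    return out


def satisfies_constraint(instance):
    """Equality constraint of Example 4.2: P(x) and Q(y, z) imply x = y."""
    ps = [args[0] for rel, args in instance if rel == "P"]
    qs = [args[0] for rel, args in instance if rel == "Q"]
    return all(x == y for x in ps for y in qs)


def successors(instance, check_constraint):
    """Successors of the initial state, one per choice of service call results."""
    pending = do_alpha(instance)
    calls = sorted({t for _, args in pending for t in args if isinstance(t, tuple)})
    result = []
    for values in product(SAMPLE_DOMAIN, repeat=len(calls)):
        call_map = dict(zip(calls, values))
        nxt = frozenset(
            (rel, tuple(call_map.get(t, t) for t in args)) for rel, args in pending
        )
        if not check_constraint or satisfies_constraint(nxt):
            result.append((nxt, call_map))
    return result


without_constraint = successors(I0, check_constraint=False)  # Example 4.1
with_constraint = successors(I0, check_constraint=True)      # Example 4.2
```

Over the three sample values, the unconstrained system has one successor per pair of results for $f(a)$ and $g(a)$, while the constrained one keeps only those in which $f(a)$ returns $a$.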



4.2 Run-Bounded Systems 

We now study the verification of DCDSs with deterministic
services. In particular, we are interested in the following problem:
given a DCDS $\mathcal{S}$ and a temporal property $\Phi$, check whether
$\Upsilon_{\mathcal{S}} \models \Phi$. Not surprisingly, given the expressive power of DCDSs
as a computation model, the verification problem is undecidable for
all the $\mu$-calculus variants introduced in Section 3. In fact, we can
show an even stronger undecidability result, for a very small fragment
of propositional linear temporal logic (LTL) [34], namely the
safety properties of the form $\mathbf{G}\,p$ where $p$ is propositional.

Theorem 4.1. There exists a DCDS $\mathcal{S}$ with deterministic services,
and a propositional LTL safety property $\Phi$, such that checking
$\Upsilon_{\mathcal{S}} \models \Phi$ is undecidable.

In the following, we isolate a notable class of DCDSs for which
verification of $\mu\mathcal{L}_A$ is not only decidable, but can also be reduced
to standard model checking techniques.

Consider a transition system
$\Upsilon = \langle \Delta, \mathcal{R}, \Sigma, s_0, db, \Rightarrow \rangle$. A run
$\tau$ in $\Upsilon$ is a (finite or infinite) sequence of states $s_0 s_1 s_2 \cdots$ rooted
at $s_0$, where $s_i \Rightarrow s_{i+1}$. We use $\tau(i)$ to denote $s_i$ and $\tau[i]$ to
represent the finite prefix $s_0 \cdots s_i$ of $\tau$. A run $\tau = s_0 s_1 s_2 \cdots$
is (data) bounded if the number of values mentioned inside its
databases is bounded, i.e., there exists a finite bound $b$ such that
$|\bigcup_{s \text{ state of } \tau} \mathrm{ADOM}(db(s))| < b$. This is equivalent to saying that,
for every finite prefix $\tau[i]$ of $\tau$, $|\bigcup_{0 \le j \le i} \mathrm{ADOM}(db(s_j))| < b$.
We say that $\Upsilon$ is run-bounded if there exists a bound $b$ such that
every run in $\Upsilon$ is (data) bounded by $b$. A DCDS $\mathcal{S}$ is run-bounded
if its concrete transition system $\Upsilon_{\mathcal{S}}$ is run-bounded.
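The quantity that run-boundedness constrains can be computed on any finite prefix of a run. The following sketch assumes that each state is given directly by its database, encoded as a set of `(relation, args)` facts; the helper names are hypothetical. Note that inspecting a finite prefix can only refute a candidate bound, it cannot establish run-boundedness.

```python
def adom(instance):
    """Active domain of a database instance given as a set of (relation, args) facts."""
    return {value for _, args in instance for value in args}


def values_along_prefix(run_prefix):
    """Union of the active domains of all databases in a finite run prefix."""
    seen = set()
    for instance in run_prefix:
        seen |= adom(instance)
    return seen


def prefix_within_bound(run_prefix, bound):
    """True if the prefix mentions strictly fewer than `bound` distinct values."""
    return len(values_along_prefix(run_prefix)) < bound


# A prefix in which a value is generated, copied, and a second one is generated.
prefix = [
    {("R", ("a",))},
    {("Q", ("v1",))},
    {("R", ("v1",))},
    {("Q", ("v2",))},
]
```

Here the prefix mentions the three values a, v1 and v2, so it respects the bound 4 but not the bound 3.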

Intuitively, a (data) unbounded run represents an execution of 
the DCDS in which infinitely many distinct values occur because 
infinitely many different service calls are issued. Since we model 
deterministic services whose number is finite, this can only happen 
if some service is repeatedly called with arguments that are the re- 
sult of previous service calls. This means that the values of the run 
indirectly depend on arbitrarily many states in the past. 

Notice that run-boundedness does not impose any restriction
on the branching of the transition system; in particular, $\Upsilon_{\mathcal{S}}$ is
typically infinite-branching because new service calls may return
any possible value. We show that run-boundedness guarantees
decidability of $\mu\mathcal{L}_A$ verification for DCDSs with deterministic
services.

Theorem 4.2. Verification of $\mu\mathcal{L}_A$ properties on run-bounded
DCDSs with deterministic services is decidable.

We get this result by showing that for run-bounded DCDSs there
always exists an abstract finite-state transition system that is
history preserving bisimilar to the concrete one, and hence satisfies
the same $\mu\mathcal{L}_A$ formulae as the concrete transition system.

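Theorem 4.3. For every run-bounded DCDS $\mathcal{S}$ with deterministic
services, given its concrete transition system $\Upsilon_{\mathcal{S}}$ there exists
an (abstract) finite-state transition system $\Theta_{\mathcal{S}}$ such that
$\Theta_{\mathcal{S}}$ is history preserving bisimilar to $\Upsilon_{\mathcal{S}}$, i.e.,
$\Theta_{\mathcal{S}} \sim \Upsilon_{\mathcal{S}}$.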

Figure 2: Concrete and abstract transition systems of the DCDS with deterministic services described in Example 4.2; special
relations that store the service call results are represented using a call $\mapsto$ value notation

Figure 3: Concrete and abstract transition systems of the DCDS with deterministic services described in Example 4.1; special
relations that store the service call results are represented using a call $\mapsto$ value notation



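Let $\Sigma$ be the set of states of $\Theta_{\mathcal{S}}$ and
$\mathrm{ADOM}(\Theta_{\mathcal{S}}) = \bigcup_{s_i \in \Sigma} \mathrm{ADOM}(db(s_i))$.
If $\Theta_{\mathcal{S}}$ is finite-state, then there exists a
bound $b$ such that $|\mathrm{ADOM}(\Theta_{\mathcal{S}})| < b$. Consequently, it is possible
to transform a $\mu\mathcal{L}_A$ property $\Phi$ into an equivalent finite
propositional $\mu$-calculus formula $\mathrm{PROP}(\Phi)$, where $\mathrm{PROP}(\Phi)$ is
inductively defined over the structure of $\Phi$ as the identity, except for
the following case:
$\mathrm{PROP}(\exists x.\mathrm{LIVE}(x) \land \Psi(x)) = \bigvee_{t_i \in \mathrm{ADOM}(\Theta_{\mathcal{S}})} \mathrm{LIVE}(t_i) \land \mathrm{PROP}(\Psi(t_i))$.
Clearly, $\Theta_{\mathcal{S}} \models \Phi$ if and only if $\Theta_{\mathcal{S}} \models \mathrm{PROP}(\Phi)$.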
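The propositionalization step can be illustrated on a toy formula representation. The tuple-based syntax tree below (tags such as `"exists_live"`, `"atom"`, `"diamond"`, `"pvar"`) is a hypothetical encoding chosen for this sketch; it also assumes that individual variables and domain values never share a name.

```python
def substitute(formula, var, value):
    """Replace the free occurrences of the individual variable `var` by `value`."""
    tag = formula[0]
    if tag == "atom":                      # ("atom", relation, args)
        return ("atom", formula[1], tuple(value if a == var else a for a in formula[2]))
    if tag == "live":                      # ("live", term)
        return ("live", value if formula[1] == var else formula[1])
    if tag == "pvar":                      # ("pvar", Z): fixpoint variable
        return formula
    if tag == "exists_live":               # ("exists_live", bound_var, body)
        if formula[1] == var:              # var is re-bound here, so stop
            return formula
        return ("exists_live", formula[1], substitute(formula[2], var, value))
    if tag in ("mu", "nu"):                # ("mu", Z, body)
        return (tag, formula[1], substitute(formula[2], var, value))
    # ("not", f), ("diamond", f), ("and", f, g, ...), ("or", f, g, ...)
    return (tag,) + tuple(substitute(sub, var, value) for sub in formula[1:])


def prop(formula, adom):
    """Expand every `exists x. LIVE(x) and body` into a finite disjunction over adom."""
    tag = formula[0]
    if tag in ("atom", "live", "pvar"):
        return formula
    if tag == "exists_live":
        _, var, body = formula
        return ("or",) + tuple(
            ("and", ("live", t), prop(substitute(body, var, t), adom))
            for t in sorted(adom)
        )
    if tag in ("mu", "nu"):
        return (tag, formula[1], prop(formula[2], adom))
    return (tag,) + tuple(prop(sub, adom) for sub in formula[1:])


# exists x. LIVE(x) and <->R(x), over the finite active domain {a, b}
phi = ("exists_live", "x", ("diamond", ("atom", "R", ("x",))))
expanded = prop(phi, {"a", "b"})
```

The result is a quantifier-free formula whose ground atoms can be treated as propositions by a conventional model checker.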

Theorem 4.4. Verification of $\mu\mathcal{L}_A$ properties for run-bounded
DCDSs with deterministic services can be reduced to conventional
model checking of propositional $\mu$-calculus over a finite
transition system.

By the above theorem, and recalling that model checking of
propositional $\mu$-calculus formulae over finite transition systems is
decidable [22], we get Theorem 4.2.



We conclude the section by observing that the approach presented
above for $\mu\mathcal{L}_A$ does not extend to full $\mu\mathcal{L}$.

Theorem 4.5. There exists a DCDS $\mathcal{S}$ for which it is impossible
to find a faithful finite-state abstraction that satisfies the same
$\mu\mathcal{L}$ properties as $\mathcal{S}$.

Theorem 4.5 is proved by exhibiting, for every $n$, a $\mu\mathcal{L}$ property
that requires the existence of at least $n$ objects in the transition
system.

Even if this observation does not imply undecidability of model
checking $\mu\mathcal{L}$ properties over run-bounded DCDSs, it shows that
there is no hope of reducing this problem to standard, finite-state
model checking.

4.3 Weakly Acyclic DCDSs 

The results presented in Section 4.2 rely on the hypothesis that
the DCDS under study is run-bounded, which is a semantic restriction.
A natural question is whether it is possible to check
run-boundedness of a DCDS. We provide a negative answer to this
question.

Theorem 4.6. Checking run-boundedness of DCDSs with
deterministic services is undecidable.

To mitigate this issue, we investigate a sufficient syntactic condition
that can be effectively tested over the process layer of the
DCDS: if the condition is met, then the DCDS is guaranteed to
be run-bounded; otherwise nothing can be said. To this end, we recast
the approach of [3] in the more abstract and expressive framework
presented here. In particular, we first introduce the "positive
approximate" of a DCDS, which abstracts away some of its
aspects. We do so for convenience, but we note that the definition
of weak acyclicity as well as our results can be stated directly
over the original DCDS (in fact, we do so in condensed presentations
of this work). Technically, given a DCDS
$\mathcal{S} = \langle \mathcal{D}, \mathcal{P} \rangle$ with data layer
$\mathcal{D} = \langle \mathcal{C}, \mathcal{R}, \mathcal{E}, \mathcal{I}_0 \rangle$ and process layer
$\mathcal{P} = \langle \mathcal{F}, \mathcal{A}, \varrho \rangle$, its positive approximate
$\mathcal{S}^+$ is a DCDS $\langle \mathcal{D}^+, \mathcal{P}^+ \rangle$, where
$\mathcal{D}^+ = \langle \mathcal{C}, \mathcal{R}, \emptyset, \mathcal{I}_0 \rangle$ corresponds to
$\mathcal{D}$ without equality constraints, while
$\mathcal{P}^+ = \langle \mathcal{F}, \mathcal{A}^+, \varrho^+ \rangle$ is a process layer whose
actions $\mathcal{A}^+$ and process $\varrho^+$ are obtained as follows:

• Each condition-action rule $Q \mapsto \alpha$ in $\varrho$ becomes
$\mathsf{true} \mapsto \alpha^+$ in $\varrho^+$. Therefore, $\varrho^+$ is a process that
supports the execution of every action in $\mathcal{A}^+$ at each step.

• Each action $\alpha(p_1, \ldots, p_n) : \{e_1, \ldots, e_m\}$ in $\mathcal{A}$ becomes
$\alpha^+ : \{e_1^+, \ldots, e_m^+\}$ in $\mathcal{A}^+$, where each
$e_i = q_i^+ \land Q_i^- \rightsquigarrow E_i$ becomes in turn
$e_i^+ = q_i^+ \rightsquigarrow E_i$. Intuitively, the positive
approximate action is obtained from the original action by
removing all the parameters from its signature, and by
removing all "negative" components from the query used to
instantiate its effect specifications; note that the variables of
$q_i^+$ that were parameters in $\alpha$ are now free variables in $q_i^+$.
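The two transformation rules can be sketched on a toy representation of a process layer. The `Effect` and `Action` classes, the opaque string used for the negative filter, and the simplification that each $q_i^+$ is a plain conjunction of atoms are all assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Tuple

Atom = Tuple[str, tuple]  # (relation name, terms)


@dataclass(frozen=True)
class Effect:
    positive: Tuple[Atom, ...]  # q_i^+, simplified to a conjunction of atoms
    negative: str               # Q_i^-, kept opaque in this sketch
    head: Tuple[Atom, ...]      # E_i


@dataclass(frozen=True)
class Action:
    name: str
    params: Tuple[str, ...]
    effects: Tuple[Effect, ...]


def positive_approximate(actions, rules):
    """Drop parameters and negative filters; guard every action with `true`.

    `rules` is a list of (condition, action name) pairs.
    """
    pos_actions = [
        Action(a.name + "+", (), tuple(Effect(e.positive, "true", e.head) for e in a.effects))
        for a in actions
    ]
    pos_rules = sorted({("true", name + "+") for _, name in rules})
    return pos_actions, pos_rules


alpha = Action(
    "alpha",
    ("p",),
    (Effect((("R", ("x", "p")),), "not S(x)", (("Q", ("x",)),)),),
)
pos_actions, pos_rules = positive_approximate([alpha], [("R(p, p)", "alpha")])
```

In the result, the former parameter `p` still occurs in the positive query, but it is now simply a free variable of that query.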

The positive approximate fulfils the following key property. 

Lemma 4.1. Given a DCDS $\mathcal{S}$, if its positive approximate $\mathcal{S}^+$
is run-bounded, then $\mathcal{S}$ is run-bounded as well.

To derive a sufficient condition for $\mathcal{S}^+$ to be run-bounded, we can
exploit a strict correspondence between the execution of an action
in $\mathcal{P}^+$ and a step in the chase of a set of tuple generating
dependencies (TGDs) in data exchange [2, 23]. In particular, we resort to a
well-known result in data exchange, namely chase termination for
weakly acyclic TGDs [23].

In our setting, the weak acyclicity of a process layer is a property 
over a dataflow graph constructed by analyzing the corresponding 
positive approximate process layer. A non-weakly acyclic DCDS 
contains a service that may be repeatedly called, every time using 
fresh values that are directly or indirectly obtained by manipulat- 
ing previous results produced by the same service. This self- 
dependency can potentially lead to an infinite number of calls of 
the same service along an execution of the system, thus making it 
impossible to put a bound on the data used throughout the run (see
also Example 4.3). Weak acyclicity rules out such self-dependencies
and is actually a sufficient condition for run-boundedness.

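Given a DCDS $\mathcal{S} = \langle \mathcal{D}, \mathcal{P} \rangle$ with positive approximate
$\mathcal{S}^+ = \langle \mathcal{D}^+, \mathcal{P}^+ \rangle$, the dependency graph of
$\mathcal{P}^+$ is an edge-labeled directed graph $\langle N, E \rangle$ where:
(i) $N \subseteq \mathcal{R} \times \mathbb{N}^+$ is a set of nodes such that
$\langle R, i \rangle \in N$ for every $R/n \in \mathcal{R}$ and every $i \in \{1, \ldots, n\}$;
(ii) $E \subseteq N \times N \times \{\mathsf{true}, \mathsf{false}\}$ is a set of labeled edges where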



Notice that using other variants of weak acyclicity is also possible 



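• an ordinary edge $\langle \langle R_1, j \rangle, \langle R_2, k \rangle, \mathsf{false} \rangle \in E$ if there exists
an action $\alpha^+ \in \mathcal{A}^+$, an effect $q_i^+ \rightsquigarrow E_i \in \mathrm{EFFECT}(\alpha^+)$
and a variable $x$ such that $R_1(\ldots, t_{j-1}, x, t_{j+1}, \ldots)$ occurs
in $q_i^+$ and $R_2(\ldots, t'_{k-1}, x, t'_{k+1}, \ldots)$ occurs in $E_i$;

• a special edge $\langle \langle R_1, j \rangle, \langle R_2, k \rangle, \mathsf{true} \rangle \in E$ if there exists
an action $\alpha^+ \in \mathcal{A}^+$, an effect $q_i^+ \rightsquigarrow E_i \in \mathrm{EFFECT}(\alpha^+)$
and a variable $x$ such that $R_1(\ldots, t_{j-1}, x, t_{j+1}, \ldots)$ occurs
in $q_i^+$, $R_2(\ldots, t'_{k-1}, t, t'_{k+1}, \ldots)$ occurs in $E_i$, and
$t = f(\ldots, x, \ldots)$, with $f \in \mathcal{F}$.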

$\mathcal{P}$ is weakly acyclic if the dependency graph of its approximate $\mathcal{P}^+$
does not contain any cycle going through a special edge. We say
that a DCDS is weakly acyclic if its process layer is weakly acyclic
(e.g., see Figure 5(a)).

Intuitively, ordinary edges represent the possible propagation
(copy) of a value across states:
$\langle \langle R_1, j \rangle, \langle R_2, k \rangle, \mathsf{false} \rangle \in E$ reflects
the possibility that the value currently stored inside the $j$-th
component of an $R_1$ tuple will be moved to the $k$-th component
of an $R_2$ tuple in the next state. In contrast, special edges represent
that a value can be taken as parameter of a service call, thus
contributing to the creation of (possibly new) values across states:
$\langle \langle R_1, j \rangle, \langle R_2, k \rangle, \mathsf{true} \rangle \in E$ means that the value
currently stored inside the $j$-th component of an $R_1$ tuple could be
used as parameter for a service call, whose result is then stored
inside the $k$-th component of an $R_2$ tuple.

A cycle going through a special edge, forbidden by the weak 
acyclicity condition, represents that a service may be repeatedly 
called, every time using fresh values that are indirectly or directly 
obtained by manipulating previous results produced by the same 
service. This self-dependency can potentially lead to an infinite 
number of calls of the same service along an execution of the sys- 
tem, thus making it impossible to put a bound on the data used 
throughout the run. 
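The weak acyclicity test can be sketched directly from the definition of the dependency graph. The encoding is an assumption of this sketch: each effect of the positive approximate is a `(body atoms, head atoms)` pair, variables are strings starting with `?`, and a service call `f(x)` in a head is the tuple `("f", "?x")`; nested service calls are not handled.

```python
from collections import defaultdict


def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def dependency_edges(effects):
    """Edges ((R1, j), (R2, k), special) of the dependency graph."""
    edges = set()
    for body, head in effects:
        for rel1, args1 in body:
            for j, x in enumerate(args1, start=1):
                if not is_var(x):
                    continue
                for rel2, args2 in head:
                    for k, t in enumerate(args2, start=1):
                        if t == x:                                # value is copied
                            edges.add(((rel1, j), (rel2, k), False))
                        elif isinstance(t, tuple) and x in t[1:]:  # value feeds a call
                            edges.add(((rel1, j), (rel2, k), True))
    return edges


def is_weakly_acyclic(effects):
    """True if no cycle of the dependency graph goes through a special edge."""
    edges = dependency_edges(effects)
    succ = defaultdict(set)
    for u, v, _ in edges:
        succ[u].add(v)

    def reaches(start, goal):
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(succ[node])
        return False

    # A special edge u -> v lies on a cycle exactly when u is reachable from v.
    return not any(special and reaches(v, u) for u, v, special in edges)


# Effects R(x) ~> Q(f(x)) and Q(x) ~> R(x): a special edge on a cycle.
cyclic = [
    ([("R", ("?x",))], [("Q", (("f", "?x"),))]),
    ([("Q", ("?x",))], [("R", ("?x",))]),
]
# Effects Q(a,a) and P(x) ~> R(x); P(x) ~> P(x), Q(f(x), g(x)).
acyclic = [
    ([("Q", ("a", "a")), ("P", ("?x",))], [("R", ("?x",))]),
    ([("P", ("?x",))], [("P", ("?x",)), ("Q", (("f", "?x"), ("g", "?x")))]),
]
```

The check runs one reachability search per special edge, which is polynomial in the size of the graph.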

Example 4.3. Let $\mathcal{S} = \langle \mathcal{D}, \mathcal{P} \rangle$ be a DCDS with data layer
$\mathcal{D} = \langle \mathcal{C}, \mathcal{R}, \emptyset, \mathcal{I}_0 \rangle$ and process layer
$\mathcal{P} = \langle \mathcal{F}, \mathcal{A}, \varrho \rangle$, where $\mathcal{F} = \{f/1\}$,
$\mathcal{R} = \{R/1, Q/1\}$, $\mathcal{I}_0 = \{R(a)\}$,
$\varrho = \{\mathsf{true} \mapsto \alpha\}$ and $\mathcal{A} = \{\alpha\}$, where
$\alpha : \{R(x) \rightsquigarrow Q(f(x)),\; Q(x) \rightsquigarrow R(x)\}$.

$\mathcal{S}$ is not weakly acyclic, due to the mutual dependency between
$R$ and $Q$ that involves a call to service $f$. This can be easily seen
from the dataflow graph (shown in Figure 5(b)), which contains a
special edge from $\langle R, 1 \rangle$ to $\langle Q, 1 \rangle$, and a normal edge from
$\langle Q, 1 \rangle$ to $\langle R, 1 \rangle$. Notice that, in this case, the positive
approximate of $\mathcal{S}$ coincides with $\mathcal{S}$ itself. Starting from the initial
state, $\alpha$ calls $f(a)$ and stores the result inside $Q$. A second execution
of $\alpha$ transfers the result of $f(a)$ into $R$. When $\alpha$ is executed for the
third time, $f$ is called again, but using as parameter the previously
obtained result. Consequently, $f$ may return a new, fresh result, because
$f(f(a))$ may be different from $f(a)$. This chain can be repeated
forever, possibly generating infinitely many distinct values
along the run. The existence of a run in which $a$, $f(a)$, $f(f(a))$,
$f(f(f(a)))$, $\ldots$ are all distinct values makes it impossible to obtain
a finite-state abstraction for $\mathcal{S}$ (see Figure 4(b)). ■

Theorem 4.7. Every weakly acyclic DCDS with deterministic 
services is run-bounded. 

Checking weak acyclicity is polynomial in the size of the DCDS. 
Thus it gives us an effective way to verify DCDSs. 

Theorem 4.8. Verification of $\mu\mathcal{L}_A$ properties for weakly
acyclic DCDSs with deterministic services is decidable, and can
be reduced to model checking of propositional $\mu$-calculus over a
finite transition system.

Figure 4: Concrete and abstract transition systems of the run-unbounded DCDS with deterministic services $\mathcal{S}$ described in
Example 4.3; special relations that store the service call results are represented using a call $\mapsto$ value notation

(a) Weakly acyclic dataflow graph for the DCDSs of Examples 4.1 and 4.2
(b) Non weakly acyclic dataflow graph for the DCDS of Example 4.3

Figure 5: Examples of dataflow graphs for DCDSs with deterministic services; special edges are decorated with *



Example 4.4. Consider the DCDSs described in Examples 4.1
and 4.2. They have the same dataflow graph, which is weakly
acyclic (see Figure 5(a)). This guarantees that they are run-bounded
and that it is possible to find a faithful finite-state abstraction for
them. Two such abstractions are shown in Figures 3(b) and 2(b),
respectively. ■

5. NONDETERMINISTIC SERVICES 

We now consider DCDSs under the assumption that services be- 
have nondeterministically, i.e., two calls of a service with the same 
arguments may return distinct results during the same run. This 
case captures both services that model a truly nondeterministic process
(e.g., human operators, random processes), and services that
model stateful servers. In the remainder of this section, whenever 
we refer to a DCDS, services are implicitly assumed nondetermin- 
istic. 

5.1 Semantics 

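As in the case of deterministic services, we define the semantics
of a DCDS $\mathcal{S}$ in terms of a (possibly infinite) transition system
$\Upsilon_{\mathcal{S}}$.

Let $\mathcal{S} = \langle \mathcal{D}, \mathcal{P} \rangle$ be a DCDS with data layer
$\mathcal{D} = \langle \mathcal{C}, \mathcal{R}, \mathcal{E}, \mathcal{I}_0 \rangle$ and process layer
$\mathcal{P} = \langle \mathcal{F}, \mathcal{A}, \varrho \rangle$. A state is simply a relational
instance of $\mathcal{R}$ over $\mathcal{C}$ satisfying each constraint in $\mathcal{E}$. We denote
the initial state with $\mathcal{I}_0$.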

Next, we define the semantics of action application. Let $\alpha$ be an
action in $\mathcal{A}$ of the form $\alpha(p_1, \ldots, p_m) : \{e_1, \ldots, e_{m'}\}$
with effects $e_i = q_i^+ \land Q_i^- \rightsquigarrow E_i$. The parameters for $\alpha$
are guarded by the condition-action rule $Q \mapsto \alpha \in \varrho$. Let $\sigma$
be a legal substitution for the input parameters $p_1, \ldots, p_m$ with
values taken from $\mathcal{C}$.

We reuse the definition of $\mathrm{DO}(\mathcal{I}, \alpha\sigma)$ from Section 4.1 as the
instance obtained by evaluating the effects of $\alpha$ on instance $\mathcal{I}$. Recall
that $\mathrm{DO}()$ generates an instance over values from the domain $\mathcal{C}$ but
also over Skolem terms, which model service calls. For any such
instance $\mathcal{I}$, we denote with $\mathrm{CALLS}(\mathcal{I})$ the set of calls it contains.
For a given set $D \subseteq \mathcal{C}$, we denote with $\mathrm{EVALS}_D(\mathcal{I}, \alpha, \sigma)$ the set
of substitutions that replace all service calls in $\mathrm{DO}(\mathcal{I}, \alpha, \sigma)$ with
values in $D$:

$$\mathrm{EVALS}_D(\mathcal{I}, \alpha, \sigma) = \{ \theta \mid \theta \text{ is a total function } \theta : \mathrm{CALLS}(\mathrm{DO}(\mathcal{I}, \alpha, \sigma)) \to D \}.$$

Each substitution in $\mathrm{EVALS}_D(\mathcal{I}, \alpha, \sigma)$ models the simultaneous
evaluation of all service calls, which replaces the calls with results
selected nondeterministically from $D$. In the following, we refer
to these substitutions as evaluations.
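Enumerating evaluations over a finite set $D$ can be sketched as below. The encoding of service calls as tuples such as `("f", "a")` and the helper names are assumptions of this sketch; unlike the deterministic case, no service call map is carried over from one step to the next.

```python
from itertools import product


def calls_in(instance):
    """CALLS(I): the service call terms (tuples) occurring in a set of facts."""
    return {t for _, args in instance for t in args if isinstance(t, tuple)}


def evals(pending_instance, domain):
    """Yield every total function from the calls of the instance to `domain`."""
    calls = sorted(calls_in(pending_instance))
    for values in product(domain, repeat=len(calls)):
        yield dict(zip(calls, values))


def apply_evaluation(pending_instance, theta):
    """Replace each service call by the result chosen in the evaluation theta."""
    return frozenset(
        (rel, tuple(theta.get(t, t) for t in args)) for rel, args in pending_instance
    )


# An instance produced by DO() that still contains the call f(a).
pending = {("R", ("a",)), ("Q", (("f", "a"),))}
thetas = list(evals(pending, ["a", "b"]))
results = {apply_evaluation(pending, theta) for theta in thetas}
```

Because an evaluation is a function on calls, every occurrence of the same call within one step receives the same result.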

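Concrete action execution. We introduce a transition relation
$\text{N-EXEC}_{\mathcal{S}}$ between states, called concrete execution of $\alpha\sigma\theta$, such
that $\langle \mathcal{I}, \alpha\sigma\theta, \mathcal{I}' \rangle \in \text{N-EXEC}_{\mathcal{S}}$ if the following holds:

1. $\sigma$ is a legal parameter assignment for $\alpha$ in state $\mathcal{I}$,

2. $\theta \in \mathrm{EVALS}_{\mathcal{C}}(\mathcal{I}, \alpha, \sigma)$,

3. $\mathcal{I}' = \mathrm{DO}(\mathcal{I}, \alpha, \sigma)\theta$, and

4. $\mathcal{I}'$ satisfies the constraints $\mathcal{E}$.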

Notice that, in contrast to the deterministic services case, the
choice of evaluation $\theta$ is not subject to the requirement that it
evaluates a service call to the same result across concrete execution
steps. However, notice that within a concrete execution step, all 
occurrences of the same service call evaluate to the same result 
(modeling the fact that a call with given arguments is invoked only 
once per transition, and the returned result is copied as needed). 

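Concrete transition system. The concrete transition system $\Upsilon_{\mathcal{S}}$
for $\mathcal{S}$ is a transition system whose states are labeled by databases.
More precisely,
$\Upsilon_{\mathcal{S}} = \langle \mathcal{C}, \mathcal{R}, \Sigma, s_0, db, \Rightarrow \rangle$ where
$s_0 = \mathcal{I}_0$ and $db$ is such that $db(\mathcal{I}) = \mathcal{I}$. $\Sigma$ and
$\Rightarrow$ are defined by simultaneous induction as
the smallest sets satisfying the following properties: (i) $\mathcal{I}_0 \in \Sigma$;
(ii) if $\mathcal{I} \in \Sigma$, then for all $\alpha$, $\sigma$, $\theta$ and $\mathcal{I}'$ such that
$\langle \mathcal{I}, \alpha\sigma\theta, \mathcal{I}' \rangle \in \text{N-EXEC}_{\mathcal{S}}$, we have that
$\mathcal{I}' \in \Sigma$, and $\mathcal{I} \Rightarrow \mathcal{I}'$.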

5.2 State-Bounded Systems 

We consider the verification problem for DCDSs with nondeterministic
services. As in the deterministic case, restrictions on both
the processes and the properties are required, motivated by the
following undecidability result.

Theorem 5.1. There exists a DCDS $\mathcal{S}$ with nondeterministic
services, and a propositional LTL safety property $\Phi$, such that
checking $\Upsilon_{\mathcal{S}} \models \Phi$ is undecidable.

State-bounded DCDS. Since we are interested in verifying
more expressive temporal properties, we need to consider restricted
classes of DCDSs. We observe first that, with nondeterministic
services, the run-boundedness restriction of Section 4.2 is very limiting
on the form of the DCDS, as it boils down to imposing a bound
on how many times each service may be called with the same
arguments. Observe that this was not the case for deterministic
services, where unlimited same-argument calls are allowed, as they
all return the same result. We propose a less restrictive alternative.
We say that a DCDS $\mathcal{S}$ is state-bounded if there is a finite bound $b$
such that for each state $\mathcal{I}$ of $\Upsilon_{\mathcal{S}}$, $|\mathrm{ADOM}(\mathcal{I})| < b$. Notice that, in
contrast to the notion of run-boundedness, state-boundedness does
allow runs in which infinitely many distinct values occur because
infinitely many service calls are issued. The unboundedly many
call results are distributed across states of the run, but may not
accumulate within a single state. The following result shows that we
also need to restrict the logic, as the one used in the deterministic
case is too expressive for decidability.

Theorem 5.2. Verification of $\mu\mathcal{L}_A$ properties on state-bounded
DCDSs with nondeterministic services is undecidable.

We therefore restrict the property class to the logic
$\mu\mathcal{L}_P \subset \mu\mathcal{L}_A$ presented in Section 3.2.

Theorem 5.3. Verification of $\mu\mathcal{L}_P$ properties on state-bounded
DCDSs with nondeterministic services is decidable.

5.3 Abstract Transition System 

We relegate the proof of Theorem 5.3 to Appendix C.3 but provide
the main ideas here.

Given a DCDS $\mathcal{S}$, we show that if the concrete transition system
$\Upsilon_{\mathcal{S}}$ is state-bounded, then there is a finite-state abstract transition
system $\Theta_{\mathcal{S}}$ whose states and edges are subsets of those in
$\Upsilon_{\mathcal{S}}$, such that $\Theta_{\mathcal{S}}$ is persistence-preserving bisimilar to
$\Upsilon_{\mathcal{S}}$ (and hence satisfies the same $\mu\mathcal{L}_P$ properties, by
Theorem 3.2). Since $\Theta_{\mathcal{S}}$ is finite-state, the verification of
$\mu\mathcal{L}_P$ properties on $\Upsilon_{\mathcal{S}}$ reduces to finite-state model checking
on $\Theta_{\mathcal{S}}$, and hence is decidable.

The existence of $\Theta_{\mathcal{S}}$ follows from the key fact that if two states
of $\Upsilon_{\mathcal{S}}$ are isomorphic, then they are persistence-preserving
bisimilar. This implies that one can construct a finitely-branching
transition system $\Theta_{\mathcal{S}}$ (i.e., with a finite number of successors per
state), such that $\Theta_{\mathcal{S}}$ is persistence-preserving bisimilar to
$\Upsilon_{\mathcal{S}}$, by dropping sibling states from $\Upsilon_{\mathcal{S}}$ as follows: instead of
listing among the successors of $s$ one state for each possible instantiation
of the service call results, just keep a representative state for each
isomorphism type. Since the number of service calls made in each state
is finite, the number of distinct isomorphism types is finite, so the
finite branching follows. We call a transition system $\Theta_{\mathcal{S}}$ obtained
as above a pruning of $\Upsilon_{\mathcal{S}}$.
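Keeping one representative successor per isomorphism type can be sketched as follows. This is a brute-force illustration, not the construction used in the proof: it assumes that isomorphisms must act as the identity on a caller-supplied set `fixed` of values (for instance those of the predecessor state), that values are strings not starting with `#`, and that states are small enough to try every renaming of the remaining values.

```python
from itertools import permutations


def canonical_form(instance, fixed):
    """Smallest renaming of the non-fixed values; equal for isomorphic instances."""
    fresh = sorted({v for _, args in instance for v in args} - set(fixed))
    best = None
    for perm in permutations(range(len(fresh))):
        renaming = {v: f"#{i}" for v, i in zip(fresh, perm)}
        candidate = tuple(
            sorted((rel, tuple(renaming.get(v, v) for v in args)) for rel, args in instance)
        )
        if best is None or candidate < best:
            best = candidate
    return best


def prune(successors, fixed):
    """Keep the first successor encountered for each isomorphism type."""
    representatives = {}
    for instance in successors:
        representatives.setdefault(canonical_form(instance, fixed), instance)
    return list(representatives.values())


# Three successors of a state whose only value is "a".
succ = [
    {("R", ("a",)), ("Q", ("b",))},
    {("R", ("a",)), ("Q", ("c",))},  # same type as the first one
    {("R", ("a",)), ("Q", ("a",))},  # the call returned a recycled value
]
kept = prune(succ, fixed={"a"})
```

The number of renamings grows factorially with the number of non-fixed values, so the sketch is only meant for very small states.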



Notice that despite being finitely-branching, any pruning $\Theta_{\mathcal{S}}$ can
still have infinitely many states, as it may contain infinitely long
simple runs $\tau$, along which the service calls return in each state
"fresh" values, i.e., values distinct from all values appearing in the
predecessors of this state on $\tau$. This problem is solved by judiciously
selecting which representatives to keep in $\Theta_{\mathcal{S}}$ for the
successors of a state $s$. Namely, whenever the representatives of a
given isomorphism type $T$ include states generated exclusively by
service calls that "recycle" values, select only such states (finitely
many thereof, of course). By recycled values we mean values
appearing on a path leading into $s$.

If $\Upsilon_{\mathcal{S}}$ is state-bounded, then the number of service calls per state
is bounded, and due to the construction's preference for recycling,
it follows that all simple runs in $\Theta_{\mathcal{S}}$ must have finite length.
Together with the finite branching, this implies finiteness of $\Theta_{\mathcal{S}}$.

Notice that proving the existence of $\Theta_{\mathcal{S}}$ does not suffice for
decidability, as the proof is non-constructive. We therefore provide
an algorithm for constructing $\Theta_{\mathcal{S}}$ (Algorithm RCYCL). One of the
technical problems we need to overcome in developing the algo- 
rithm is that we evidently cannot start from the infinite-state con- 
crete transition system, instead exploring a portion thereof. This 
means that it is not obvious how to decide whether the successors 
of a state are generated by recycling service calls, since these calls 
may recycle from paths that RCYCL hasn't explored yet. There- 
fore, RCYCL may sometimes select non-recycling service calls 
even when a recycling alternative exists. However, we can prove 
that RCYCL constructs what we call an eventually recycling prun- 
ing, which in essence means it may fail to detect recycling service 
calls, but only a bounded number of times. 

We formalize the above discussion in Appendix C.3, where we
prove the following result:

Theorem 5.4. If the input DCDS $\mathcal{S}$ is state-bounded, then every
possible run of Algorithm RCYCL terminates, yielding a finite eventually
recycling pruning $\Theta_{\mathcal{S}}$ of $\Upsilon_{\mathcal{S}}$, with
$\Upsilon_{\mathcal{S}} \sim \Theta_{\mathcal{S}}$.

Theorem 5.4 and Theorem 3.2 directly imply Theorem 5.3.
Figures 7 and 6 illustrate two concrete transition systems, and
possible recycling prunings for them.

5.4 GR-Acyclic DCDSs 

As with run-boundedness in the deterministic services case, for 
nondeterministic services the state-boundedness restriction is a se- 
mantic property. We investigate whether it can be effectively 
checked. 

Theorem 5.5. Checking state-boundedness of DCDSs is un- 
decidable. 

Consequently we propose a sufficient syntactic restriction. 

Intuitively, for a run to have unbounded states, it must issue 
unboundedly many service calls. Since there are only a bounded 
number of effects in the process layer specification, there must ex- 
ist some service-calling effect that "cyclically generates" fresh val- 
ues (i.e. is invoked infinitely many times during the run). Notice 
that unbounded generation of fresh values is insufficient for state- 
unboundedness: these values must also accumulate in the states. 
But by definition of the DCDS semantics, a transition drops ("for- 
gets") all values that are not explicitly copied ("recalled") into the 
successor. Therefore, to accumulate, a value must be "cyclically
recalled" throughout the run (it must be copied infinitely many times
from relation to relation).



We call a run simple if no state appears more than once in the run.

Figure 6: Concrete and abstract transition systems of the state-unbounded DCDS with nondeterministic services of Example 5.2



GR-acyclicity is stated in terms of a dataflow graph constructed
by analyzing the process layer. The graph identifies how service
calls and value recalls can chain. In essence, GR-acyclicity
requires the absence of a "generate cycle" that feeds into a "recall
cycle".

GR-acyclicity. Let $\mathcal{A}$ be a set of actions, and $\mathcal{A}^+$ its positive
approximate (Section 4.3). We call dataflow graph of $\mathcal{A}$ the
directed edge-labeled graph $\langle N, E \rangle$ whose set $N$ of nodes is the set
of relation names occurring in $\mathcal{A}$, and in which each edge in $E$
is a 4-tuple $\langle R_1, id, R_2, b \rangle$, where $R_1$ and $R_2$ are two nodes in
$N$, $id$ is a (unique) edge identifier, and $b$ is a boolean flag used to
mark special edges. Formally, $E$ is the minimal set satisfying the
following condition: for each effect $e$ of $\mathcal{A}^+$, each $R(t_1, \ldots, t_m)$
in the body of $e$, each $Q(t'_1, \ldots, t'_{m'})$ in the head of $e$, and each
$i \in \{1, \ldots, m'\}$:

• if $t'_i$ is either an element of $\mathrm{ADOM}(\mathcal{I}_0)$ or a free variable, then
$\langle R, id, Q, \mathsf{false} \rangle \in E$, where $id$ is a fresh edge identifier.

• if $t'_i$ is a service call, then $\langle R, id, Q, \mathsf{true} \rangle \in E$, where $id$ is
a fresh edge identifier.

We say that $\mathcal{A}$ is GR-acyclic if there is no path $\pi = \pi_1\pi_2\pi_3$ in the
dataflow graph of $\mathcal{A}$, such that $\pi_1$, $\pi_3$ are simple cycles and $\pi_2$ is
a path containing a special edge that is disjoint from the edges of
$\pi_1$. We say that a process layer $\mathcal{P} = \langle \mathcal{F}, \mathcal{A}, \varrho \rangle$ is
GR-acyclic if $\mathcal{A}$ is GR-acyclic. We call a DCDS GR-acyclic if its
process layer is GR-acyclic.

Notice that GR-acyclicity is a purely syntactic notion. Moreover, 
it can be checked in PTIME since the dataflow graph has size poly- 
nomial in the size of the process layer specification. 

Theorem 5.6. Any GR-acyclic DCDS is state-bounded. 



We show the proof in Appendix C.4 but provide some intuition
here, noting that the dataflow analysis is significantly more subtle
than suggested above.

First, note that ordinary edges correspond to an effect copying
a value from a relation of the current state to a relation of the
successor state. Special edges correspond to feeding a value of the
current state to a service call and storing the result in a relation of
the successor state. Note that the cycles $\pi_1$ and $\pi_3$ allow both kinds
of edges, reflecting the insight that the size of the state is affected
in the same way regardless of whether a value is copied to the
successor, or it is replaced with a service call result (see Example 5.2
and Example 5.3 for illustrations of state-unboundedness arising
from each case). $\pi_1$ and $\pi_3$ are both "recall cycles": the number of
values moving around them does not decrease (this is of course a
conservative statement; reality depends on the semantics of queries
in the effects, which is abstracted away). Note that $\pi_2$ contains
a special edge $e$, which means that the values moving around $\pi_1$
are cyclically fed into the service call $f$ of $e$. The key insight
here is that, even if the set of values moving around $\pi_1$ does not
change (no special edges in $\pi_1$ replace them), and thus the service
call $f$ sees the same bounded set of distinct arguments over time, it
can still generate an unbounded number of fresh values because $f$
is nondeterministic. $\pi_1\pi_2$ constitute the "generate cycle" we
mention above. The generated values are stored in the recall cycle $\pi_3$,
where they accumulate and force the size of the relations of $\pi_3$ to
grow unboundedly.



Example 5.1. Let us consider again the DCDS $\mathcal{S}$ described
in Example 4.3, this time considering $f/1$ as a nondeterministic
service. The resulting concrete transition system is shown in
Figure 7(a). Even if $\mathcal{S}$ is not run-bounded, it is state-bounded, because
in every state its database consists of only one tuple. This is attested
by the dataflow graph shown in Figure 7(a) and guarantees the
existence of a faithful finite-state abstraction. One such finite-state
abstraction is reported in Figure 7(b). ■

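Example 5.2. Let $\mathcal{S} = \langle \mathcal{D}, \mathcal{P} \rangle$ be a DCDS with data layer
$\mathcal{D} = \langle \mathcal{C}, \mathcal{R}, \emptyset, \mathcal{I}_0 \rangle$ and process layer
$\mathcal{P} = \langle \mathcal{F}, \mathcal{A}, \varrho \rangle$, where $\mathcal{F} = \{f/1\}$,
$\mathcal{R} = \{R/1, Q/1\}$, $\mathcal{I}_0 = \{R(a)\}$, $\varrho = \{\mathsf{true} \mapsto \alpha\}$,
$\mathcal{A} = \{\alpha\}$ and
$\alpha : \{R(x) \rightsquigarrow R(x),\ R(x) \rightsquigarrow Q(f(x)),\ Q(x) \rightsquigarrow Q(x)\}$.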

$\mathcal{S}$ is not GR-acyclic, because each $R$ tuple is continuously
copied, and at the same time continuously issues a call to service $f$
whose result is stored in a $Q$ tuple, which is continuously copied as
well. This is attested by the dataflow graph of Figure 8(b).

The overall effect caused by the iterated application of $\alpha$ is that
fresh values are continuously generated and accumulated, making
$\mathcal{S}$ state-unbounded. Consider for example the application of
action $\alpha$ in state $\mathcal{I}_0$. It leads to an infinite number of
successors, each one of the form $\{R(a), Q(v)\}$, where $v$ is the value
returned by $f(a)$. Consider now a second application of $\alpha$ in one of
these states. It again leads to an infinite number of successors, due to
the nondeterminism of $f(a)$. In particular, each successor has the
form $\{R(a), Q(v), Q(v')\}$, where $v'$ is the result of the second
call $f(a)$. When $v' \neq v$, the number of tuples increases from
2 to 3. By executing $\alpha$ over and over again, for some successors
the value returned by a new call $f(a)$ will be distinct from all the
ones already stored in $Q$. This causes an indefinite increase of
the database size due to the continuous insertion of fresh $Q$ tuples.
Such behavior is clearly shown in the concrete transition system of
$\mathcal{S}$, depicted in Figure 6(a). Figure 6(b) shows instead one possible
corresponding abstraction; even if the abstraction approach ensures
that the generated transition system is finite-branching, some of its
runs pass through an infinite number of distinct, growing states. ■
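
The growth described in Example 5.2 can be reproduced with a small
simulation. The sketch below is illustrative only: the names `step`,
`fresh_values` and `run_sizes`, and the encoding of a state as a pair of
Python sets, are assumptions of this sketch and not part of the DCDS
formalism. The nondeterministic service $f$ is resolved in the "worst"
case, returning a value never seen before at every call.

```python
import itertools

def fresh_values():
    """Endless supply of values never returned before (worst case for f)."""
    return (f"v{i}" for i in itertools.count())

def step(state, fresh):
    """Apply alpha once: R(x) ~> R(x), R(x) ~> Q(f(x)), Q(x) ~> Q(x)."""
    r_tuples, q_tuples = state
    # f is called once per R value; each call yields a fresh result here.
    called = {next(fresh) for _ in r_tuples}
    return (set(r_tuples), set(q_tuples) | called)

def run_sizes(steps):
    """Database size (number of tuples) along one worst-case run."""
    fresh = fresh_values()
    state = ({"a"}, set())  # initial instance {R(a)}
    sizes = [len(state[0]) + len(state[1])]
    for _ in range(steps):
        state = step(state, fresh)
        sizes.append(len(state[0]) + len(state[1]))
    return sizes
```

Along such a run the size grows by one tuple per step, so no bound on the
state size can hold for every run.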

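Example 5.3. Let $\mathcal{S} = \langle \mathcal{D}, \mathcal{P} \rangle$ be a DCDS with data layer
$\mathcal{D} = \langle \mathcal{C}, \mathcal{R}, \emptyset, \mathcal{I}_0 \rangle$ and process layer
$\mathcal{P} = \langle \mathcal{F}, \mathcal{A}, \varrho \rangle$, where $\mathcal{F} = \{f/1, g/1\}$,
$\mathcal{R} = \{R/1\}$, $\mathcal{I}_0 = \{R(a)\}$, $\varrho = \{\mathsf{true} \mapsto \alpha\}$,
$\mathcal{A} = \{\alpha\}$ and $\alpha : \{R(x) \rightsquigarrow \{R(f(x)), R(g(x))\}\}$.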

$\mathcal{S}$ is not GR-acyclic, because its dataflow graph, shown in
Figure 8(c), contains a unique node $R$ with two distinct special looping
edges from $R$ to $R$ itself. Indeed, every time $\alpha$ is executed, each
$R$ tuple contained in the current database may generate two $R$ tuples
in the next state, such that each such new tuple contains a value
different from all the other ones. Therefore, even if the newly
generated values are not accumulated, in the "worst" case the number of
$R$ tuples is doubled every time $\alpha$ is executed. A sample run of the
system could be the following. Starting from $\mathcal{I}_0$, $\alpha$ calls $f(a)$
and $g(a)$, getting $b$ and $c$ as results and obtaining the state
$\{R(b), R(c)\}$. A second execution of $\alpha$ now involves 4 service calls
($f(b)$, $g(b)$, $f(c)$, $g(c)$), which may return 4 different new values,
e.g. leading to state $\{R(d), R(e), R(f), R(g)\}$, and so on. ■
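
The verdicts of the examples above can be replayed on an explicit
dataflow graph. The following is a minimal sketch under a simplified
reading of the condition: it reports a violation when some special edge
$e$ can be fed by a cycle that does not use $e$ (playing the role of
$\pi_1$) and leads into some cycle (playing the role of $\pi_3$). The edge
encoding, the function names, and the graph `SINGLE_LOOP` (a single
relation refreshed by one service call) are assumptions made for
illustration; the formal definition of GR-acyclicity should be consulted
for the exact side conditions.

```python
from collections import defaultdict

# An edge is (source relation, target relation, is_special, label).
# Special edges pass a value through a service call; ordinary edges copy it.

def reachable(edges, start):
    """Nodes reachable from `start` along one or more edges."""
    successors = defaultdict(set)
    for src, dst, _special, _label in edges:
        successors[src].add(dst)
    seen, stack = set(), list(successors[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(successors[node])
    return seen

def nodes_on_cycles(edges):
    """Nodes that lie on at least one cycle."""
    nodes = {n for src, dst, _special, _label in edges for n in (src, dst)}
    return {n for n in nodes if n in reachable(edges, n)}

def has_generate_recall_path(edges):
    """Simplified test for a path pi_1 pi_2 pi_3 with a special edge in pi_2."""
    cyclic = nodes_on_cycles(edges)
    for i, (src, dst, special, _label) in enumerate(edges):
        if not special:
            continue
        rest = edges[:i] + edges[i + 1:]  # the graph without edge e
        # pi_1: a cycle avoiding e from which the source of e is reachable.
        generate = any(c == src or src in reachable(rest, c)
                       for c in nodes_on_cycles(rest))
        # pi_3: a cycle reachable from the target of e.
        recall = dst in cyclic or bool(reachable(edges, dst) & cyclic)
        if generate and recall:
            return True
    return False

EXAMPLE_5_2 = [("R", "R", False, "copy R"), ("R", "Q", True, "f"),
               ("Q", "Q", False, "copy Q")]
EXAMPLE_5_3 = [("R", "R", True, "f"), ("R", "R", True, "g")]
SINGLE_LOOP = [("R", "R", True, "f")]  # hypothetical one-edge graph
```

Under this reading, both `EXAMPLE_5_2` and `EXAMPLE_5_3` are flagged,
while `SINGLE_LOOP` is not, since removing its only edge leaves no cycle
that could keep feeding the service call.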

GR$^+$-Acyclicity. This relaxation of GR-acyclicity is based on the
insight that, for a cycle $\pi_3$ in the dataflow graph to truly preserve
the number of values moving in it, $\pi_3$'s edges must not all be
simultaneously inactive. We say that an edge is active in a step of
the run when some action corresponding to it executes. By the
DCDS semantics, if all edges of $\pi_3$ are simultaneously inactive,
then none of the corresponding copy/call operations are executed
and all relations involved in $\pi_3$ forget their values in the next
state: $\pi_3$ is effectively flushed.

GR$^+$-acyclicity is a relaxation that does allow a path
$\pi = \pi_1\pi_2\pi_3$ as in the definition of GR-acyclicity, provided that
$\pi_2$ contains an edge $e$ that cannot be active at the same time as any
of the subsequent edges in $\pi_2\pi_3$.

Semantically, this ensures that in order for the generate cycle
$\pi_1\pi_2$ to push fresh values toward the recall cycle $\pi_3$, some action
corresponding to $e$ must execute, and in the meantime all actions
maintaining the values in cycle $\pi_3$ are disabled, thus flushing
$\pi_3$. $\pi_3$ thus receives an unbounded number of waves of fresh values
from $\pi_1\pi_2$, but it forgets each wave before the next arrives.

Of course, the property of being active at the same time is semantic
in nature, but we give a sufficient syntactic condition. Associate
with every edge $e$ in the dataflow graph the set $actions(e)$ of
actions it corresponds to (this set can be computed via simple
inspection of the process layer). Then edges $e_1, e_2$ are not
simultaneously active if $actions(e_1) \cap actions(e_2) = \emptyset$.
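
The syntactic condition amounts to a disjointness test on action sets.
The sketch below is illustrative: the function name, the string encoding
of edges, and the two-action graph `HYPOTHETICAL` are assumptions of
this sketch. Note that the test is only sufficient: a `False` answer
means the condition cannot rule out simultaneous activity, not that the
edges are necessarily active together.

```python
def provably_never_co_active(actions_e1, actions_e2):
    """Sufficient syntactic test: the two edges share no action."""
    return set(actions_e1).isdisjoint(actions_e2)

# actions(e) for the dataflow graph of Example 5.2: one action for all edges.
EXAMPLE_5_2 = {"R->R": {"alpha"}, "R->Q": {"alpha"}, "Q->Q": {"alpha"}}

# Hypothetical variant in which the call and the copy belong to
# different actions.
HYPOTHETICAL = {"R->Q": {"alpha"}, "Q->Q": {"beta"}}
```

For `EXAMPLE_5_2` no pair of edges passes the test, which matches the
observation that all of its edges correspond to a single action.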

The DCDSs discussed in Examples 5.2 and 5.3 are not GR$^+$-acyclic.
Indeed, they are not GR-acyclic, and all the edges contained in their
dataflow graphs can be simultaneously active, because they all
correspond to a single action.

We observe that GR-acyclicity is not related to weak acyclicity.
In particular, a DCDS may be GR-acyclic but not weakly acyclic
(see Example 5.1).

As with any sufficient syntactic condition for an undecidable
semantic property, an infinite succession of refinements of
GR-acyclicity is possible, each relaxing the condition to allow more
DCDS classes. We propose a very powerful relaxation in
Appendix C.4: GR$^+$-acyclicity. Appendix E shows a full-fledged
DCDS example that conforms to GR$^+$-acyclicity, showing that it
admits a practically relevant DCDS class.

Theorem 5.6 and Theorem 5.3 imply:

Theorem 5.7. Verification of $\mu\mathcal{L}_P$ properties for GR$^+$-acyclic
DCDSs with nondeterministic services is decidable.

6. DISCUSSION 

Complexity. Both in the case of weakly acyclic DCDSs with
deterministic services and of GR$^+$-acyclic DCDSs with
nondeterministic services, our construction generates a finite
transition system whose number of states is exponential in the size of
the DCDS. Let $\Phi$ be a $\mu\mathcal{L}_A$ or $\mu\mathcal{L}_P$ formula of size $\ell$
with $k$ alternating nested fixpoints. Then, considering the complexity
of propositional $\mu$-calculus model checking on finite transition
systems [22], the complexity of verification of $\Phi$ over a DCDS of
size $n$ is $O((2^n \cdot \ell)^k)$, hence in EXPTIME.

Comparison of the two semantics. It is natural to ask how the
expressivities of the two DCDS flavors compare. Interestingly, we
can show that for unrestricted DCDSs, the two semantics are equivalent
from the point of view of expressive power, i.e. any DCDS
with deterministic services can be simulated by a DCDS with
nondeterministic services, and conversely. However, we show below
that the two semantics are not equivalent with respect to decidability
of verification.

Consider first the reduction from deterministic to non- 
deterministic services. 

Theorem 6.1. Let $D$ be a DCDS with deterministic services
and schema $\mathcal{R}_D$. Then one can rewrite $D$ in linear time to a DCDS
$N$ with nondeterministic services and schema $\mathcal{R}_N$, such that (i)
$\mathcal{R}_N$ includes $\mathcal{R}_D$, (ii) the projection of $\Upsilon_N$ to $\mathcal{R}_D$
coincides with $\Upsilon_D$, and (iii) if $D$ is run-bounded, then $N$ is
state-bounded.

We turn next to the converse reduction. 

Theorem 6.2. Let $N$ be a DCDS with nondeterministic services
and schema $\mathcal{R}_N$. Then one can rewrite $N$ in linear time to a
DCDS $D$ with deterministic services and schema $\mathcal{R}_D$, such that (i)
$\mathcal{R}_D$ includes $\mathcal{R}_N$ and (ii) the projection of $\Upsilon_D$ to $\mathcal{R}_N$
coincides with $\Upsilon_N$.

Figure 7: Concrete and abstract transition systems obtained when the DCDS described in Example 4.3 has nondeterministic services

The above reductions show that for unrestricted DCDSs, deterministic
and nondeterministic services are equivalent with respect
to expressive power. However, they are not equivalent with respect
to decidability of verification. This is because state-boundedness
of the DCDS with nondeterministic services does not imply
run-boundedness of the rewritten DCDS with deterministic services.
In fact, one can prove that there exists no reduction from
state-bounded DCDSs with nondeterministic services to run-bounded
DCDSs with deterministic services: recall that for properties from
$\mu\mathcal{L}_A - \mu\mathcal{L}_P$, verification is decidable for the latter (by
Theorem 4.3) and undecidable for the former (Theorem 5.2). In
particular, the reduction we use to prove Theorem 6.2 yields a
non-weakly-acyclic DCDS and is therefore not pertinent to verification
decidability.

In contrast, for the converse reduction of Theorem 6.1, observe
that whenever $D$ is run-bounded, $N$ is state-bounded. Therefore,
if we restrict the property language to $\mu\mathcal{L}_P$, decidability of
verification for run-bounded DCDSs with deterministic services follows
as a corollary of the reduction. Recall however that decidability
holds even for the larger logic $\mu\mathcal{L}_A$ (by Theorem 4.2). Our proof
of Theorem 4.2 exploits the reduction as well, though an additional
technical contribution is needed to handle $\mu\mathcal{L}_A$.



Mixed semantics. The reduction in Theorem 6.1 allows us to
verify $\mu\mathcal{L}_P$ properties for DCDSs with a mix of deterministic
and nondeterministic services, by first rewriting to a DCDS with
exclusively nondeterministic services (as long as the rewritten
DCDS is GR-acyclic). We give an example of a DCDS with mixed
service semantics in Appendix E.

Support for arbitrary integrity constraints. We remark that, by
exploiting the equality constraints, we can extend our decidability
results to support integrity constraints on the database expressed
as arbitrary FO sentences under the active domain semantics. First,
note that the definition of DCDS semantics is independent of the
type of constraints used, as it simply requires their satisfaction by
each state of the concrete transition system. Now consider a DCDS
$\mathcal{S}$ with an FO integrity constraint $IC$ defined on its schema. We can
rewrite $\mathcal{S}$ into a DCDS $\mathcal{S}'$ that enforces $IC$ using equality
constraints. To this end, we add a binary auxiliary relation $aux$ to
the schema, initialized in the initial state to contain the tuple
$(a, b)$ of distinct constants. We add to each action an effect that
simply copies $aux$ between states, ensuring the persistence of the fact
$aux(a, b)$ throughout the run. Finally, we add an equality constraint
$ec := \neg IC \wedge aux(x, y) \rightarrow x = y$.
Notice now that $\mathcal{S}'$ will never execute an action that violates $IC$,
because that would violate $ec$. Equality constraints also prove
instrumental in modeling artifact systems, described next.
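
The effect of the auxiliary relation can be checked on concrete
instances. The sketch below is illustrative: a database is encoded as a
dictionary from relation names to sets of tuples, the integrity
constraint is passed as a Python predicate, and the sample constraint
`at_most_one_r` is hypothetical. Because $aux$ always holds the pair
$(a, b)$ of distinct constants, $ec$ is satisfied exactly when $IC$ holds.

```python
def satisfies_ec(db, ic):
    """Check ec := (not IC) and aux(x, y) -> x = y on one instance."""
    ic_holds = ic(db)
    # ec fails only if IC is violated and some aux tuple has x != y.
    return all(ic_holds or x == y for (x, y) in db["aux"])

def at_most_one_r(db):
    """Hypothetical integrity constraint: R contains at most one tuple."""
    return len(db["R"]) <= 1

GOOD = {"R": {("c",)}, "aux": {("a", "b")}}
BAD = {"R": {("c",), ("d",)}, "aux": {("a", "b")}}
```

Here `satisfies_ec(GOOD, at_most_one_r)` holds and
`satisfies_ec(BAD, at_most_one_r)` does not, so a transition into `BAD`
would be discarded by the equality constraint.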

Connection with the artifact model. In terms of expressive
capabilities, our DCDS model is equivalent to a business process model
known in the literature as the artifact model (see Section 7). While
variations thereof abound, they are virtually all special cases of the
following general model. In it, given a relational schema $T$, an
artifact of type $T$ (or $T$-artifact) is a tuple of schema $T$. The
attributes of the tuple are known as artifact variables, and they must
include an id attribute that uniquely identifies each artifact. An
artifact system has a schema comprising a collection of types
$\{T_i\}_{i \in \{1, \ldots, n\}}$ and the schema $\mathcal{R}_{db}$ of an underlying
relational database. The instance of an artifact system consists of a
relation $I_i$ for each type $T_i$ and a database of schema $\mathcal{R}_{db}$.
The artifact system also has a collection of actions (usually called
"services", a term we avoid here to rule out confusion with the
external services of the DCDS model). The execution of an action evolves
the current instance into its successor. Each action has a
pre-condition, which is an FO sentence over the artifact schema,
evaluated over the current artifact instance under the active domain
semantics. The pre-condition must hold for an action to be eligible to
execute. Actions are also equipped with a post-condition, which is
usually an $\exists$FO formula relating the current and the successor
instances (if $R$ is a relation in the schema, the post-condition's
$R$-atoms refer to the current instance, while $R'$-atoms refer to the
successor). By $\exists$FO we mean existential FO logic, in which only
existential quantifiers are allowed, and they must appear in the scope
of an even number of negations. Existentially quantified variables are
not interpreted over the active domain, but over the possibly infinite
domain. They model external inputs from the environment the artifact
system evolves in.

While we do not show a formal reduction between the two models,
we sketch here how a DCDS process can simulate an artifact-based
one. The DCDS can model the sets $I_i$ of $T_i$-artifacts using
an integrity constraint to enforce the uniqueness of the id attribute.
The pre-conditions of artifact actions correspond to the conditions
in the DCDS condition-action rules. Artifact post-conditions can
be simulated by DCDS effects, after rewriting each post-condition to
Skolem normal form and introducing for each resulting Skolem term a
nondeterministic service call. The fact that post-conditions can contain
disjunction while effects are conjunctive and positive is no
impediment: the additional expressivity needed can be transferred to the
DCDS condition-action rules, if necessary modeling one artifact
transition step with several DCDS transition steps.

7. RELATED WORK 

As discussed in Section 6, the unrestricted artifact-centric and
DCDS models have equivalent expressive capabilities. Our work 
is therefore most closely related to prior work on verification of 
artifact-centric business processes. The difference lies in how each 
work trades off between restricting the class of business processes 
versus the class of properties to verify. 

Artifact-centric processes with no database. Work on formal
analysis of artifact-based business processes in restricted contexts
has been reported in [24, 25, 7]. Properties investigated include
reachability [24, 25], general temporal constraints [25], and the
existence of complete execution or dead end [7]. For the variants
considered in each paper, verification is generally undecidable;
decidability results were obtained only under rather severe
restrictions, e.g., restricting all pre-conditions to be "true" [24],
restricting to bounded domains [25, 7], or restricting the pre- and
post-conditions to be propositional, and thus not referring to data
values [25]. [15] adopts an artifact model variation with arithmetic
operations but no database. It proposes a criterion for comparing
the expressiveness of specifications using the notion of dominance,
based on the input/output pairs of business processes. Decidability
relies on restricting runs to bounded length. [37] addresses the
problem of the existence of a run that satisfies a temporal property,
for a restricted case with no database and only propositional LTL
properties. All of these works model no underlying database (and
hence no integrity constraints).

Artifact-centric processes with underlying database. More re- 
cently, two lines of work have considered artifact-centric processes 
that also model an underlying relational database. One considers 
branching time, one only linear time. 

Branching time. Our approach stems from a line of research that
started with [16] and continued with [3] and [5] in the context
of artifact-centric processes. The connection between evolution of
data-centric dynamic systems and data exchange that we exploit in
this paper was first devised in [16]. There the dynamic system
transition relation itself is described in terms of TGDs mapping the
current state to the next, and the evolution of the system is essentially
a form of chase. Under suitable weak acyclicity conditions such a
chase terminates, thus making the DCDS transition system finite.
A first-order $\mu$-calculus without first-order quantification across
states is used as the verification formalism for which decidability
is shown. Notice that the role of getting new objects/values from the
external environment, played here by service calls, is played there
by nulls. These ideas were further developed in [3], where TGDs
were replaced by action rules with the same syntax as here.
Semantically, however, the dynamic system formalism there is deeply
different: what we call here service calls are treated there as
uninterpreted Skolem terms. This results in an ad-hoc interpretation of
equality which sees every Skolem term as equal only to itself (as
in the case of nulls [16]). The same first-order $\mu$-calculus without
first-order quantification across states of [16] is used as the
verification formalism, and a form of weak acyclicity is used as a
sufficient condition for getting finite-state transition systems and
decidability.

In the case of deterministic services, our framework is directly
inspired by [3], though here we do interpret service calls. This
decision is motivated by our goal of modeling real-life external
services, for which two distinct service calls may very well return
equal results, even under the deterministic semantics (for instance
if the same service is called with different arguments, or if distinct
services are invoked). Interpreting service calls raises a major
challenge: even under the run-bounded restriction, the concrete
transition system is infinite, because it is infinitely branching (a
service call can be interpreted with any of the constants from the
infinite domain). In contrast to [3], what we show in this case is not
that the concrete transition system is finite (it never is), but that it
is bisimilar to a finite abstract transition system. This leads to a
proof technique that is interesting in its own right, being based on
novel notions of bisimilarity for the considered $\mu$-calculus variants.
The reason standard bisimilarity is insufficient is that our logics
$\mu\mathcal{L}_P$ and $\mu\mathcal{L}_A$ allow first-order quantification across states,
so bisimilarity must respect the connection between values appearing
both in the current and successor state. Our decision to include
first-order quantification across states was motivated by the need to
express liveness properties that refer to the same data at various
points in time (e.g. "if student x is enrolled now and continues to be
enrolled in the future, then x will eventually graduate").

Inspired by [3], [5] builds a similar framework where actions
are specified via pre- and post-conditions given as FO formulae
interpreted over active domains. The verification logic considered
is a first-order variant of CTL with no quantification across states.
Thus, it inherits the limitations discussed above on expressibility of
liveness properties. In addition, the limited temporal expressivity
of CTL precludes expressing certain desirable properties such
as fairness. [5] shows that under the assumption that each state
has a bounded active domain, one can construct an abstract finite
transition system that can be checked instead of the original
concrete transition system, which is infinite-state in general. The
approach is similar to the one we developed independently for
nondeterministic services; however, without quantification across
states, standard bisimilarity suffices. As opposed to our work, the
decidability of checking state-boundedness is not investigated in
[5], and no sufficient syntactic conditions are proposed.

Linear time. Publication [21] considers an artifact model that has
the same expressive capabilities as an unrestricted class of DCDS
in which the infinite domain is equipped with a dense linear order,
which can be mentioned in pre-, post-conditions, and properties.
Runs can receive unbounded external input from an infinite
domain, and this input corresponds to nondeterministic services in
a DCDS. Verification is decidable even if the input accumulates
in states, and runs are neither run-bounded, nor state-bounded.
However, this expressive power requires restrictions that render
the result incomparable to ours. First, the property language is
a first-order extension of LTL, and it is shown that extension
to branching time (CTL*) leads to undecidability. Second, the
formulae in pre-, post-conditions and properties access read-only
and read-write database relations differently, querying the latter
only in limited fashion. In essence, data can be arbitrarily
accumulated in read-write relations, but these can be queried only by
checking that they contain a given tuple of constants. It is shown
that this restriction is tight, as even the ability to check emptiness
of a read-write relation leads to undecidability. In addition, no
integrity constraints are supported, as it is shown that allowing
a single functional dependency leads to undecidability. [19]
disallows read-write relations entirely (only the artifact variables
are writable), but this allows the extension of the decidability
result to integrity constraints expressed as embedded dependencies
with terminating chase, and to any decidable arithmetic. Again
the result is incomparable to ours, as our modeling needs include
read-write relations and their unrestricted querying.

Table 1: Summary of our (un)decidability results.

1 The result is even stronger: it holds for propositional LTL.
2 Decidability cannot be established via a faithful finite-state abstraction.
3 Decidability is obtained via reduction to finite-state model checking.

Infinite-state systems. DCDSs are a particular case of
infinite-state systems. Research on automatic verification of
infinite-state systems has also focused on extending classical model
checking techniques (e.g., see [14] for a survey). However, in much of
this work the emphasis is on studying recursive control rather than data,
which is either ignored or finitely abstracted. More recent work
has been focusing specifically on data as a source of infinity. This
includes augmenting recursive procedures with integer parameters
[10], rewriting systems with data [9], Petri nets with data associated
to tokens [28], automata and logics over infinite alphabets 11211111
[31] [20(^7^, 8 , 9|, and temporal logics manipulating data [20].
However, the restricted use of data and the particular properties
verified have limited applicability to the business process setting we
target with the DCDS model.

8. CONCLUSIONS

We summarize our results in Table 1 (arrows denote implications
between results). We note that exhibiting a finite faithful abstraction
of a concrete transition system is more than a means towards
showing decidability, being a desirable goal in its own right as the
most promising avenue towards practical implementation. Notice
that we list as open the verification of $\mu\mathcal{L}$ properties on
bounded-run DCDSs with deterministic services, but recall from
Section 4.2 that in this case there exists no faithful finite-state
abstract transition system.

We believe that DCDSs provide a natural and expressive model
for business processes powered by an underlying database, and thus
are an ideal vehicle for foundational research with potential to
transfer to alternative models.

Note that the design space for FO extensions of propositional
$\mu$-calculus is broad, and notoriously contains bounded-state settings
for which satisfiability of even modest extensions of propositional
LTL is highly undecidable (e.g. LTL with the freeze quantifier over
infinite data words [20]). In light of this, our decidability results
come as a pleasant surprise, and the two $\mu\mathcal{L}$ variants studied here,
paired with the respective DCDS classes, strike a fortuitous balance
between expressivity and verification feasibility.

APPENDIX

A. VERIFICATION 



In this appendix we give the bisimulation invariance results for
$\mu\mathcal{L}_A$ and $\mu\mathcal{L}_P$.

A.1 History Preserving Mu-Calculus

We prove history preserving bisimulation invariance for $\mu\mathcal{L}_A$.
We adopt a two-step approach. We first prove the result for the logic
$\mathcal{L}_A$, obtained from $\mu\mathcal{L}_A$ by dropping the predicate variables and
the fixpoint constructs. Such a logic corresponds to a first-order variant
of the Hennessy-Milner logic; note that the semantics of this logic is
completely independent of the second-order valuation. We then
extend the result to the whole $\mu\mathcal{L}_A$ by dealing with fixpoints.

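Lemma A.1. Consider two transition systems
$\Upsilon_1 = \langle \Delta_1, \mathcal{R}, \Sigma_1, s_{01}, db_1, \Rightarrow_1 \rangle$ and
$\Upsilon_2 = \langle \Delta_2, \mathcal{R}, \Sigma_2, s_{02}, db_2, \Rightarrow_2 \rangle$, a partial
bijection $h$ between $\Delta_1$ and $\Delta_2$, and two states $s_1 \in \Sigma_1$ and
$s_2 \in \Sigma_2$ such that $s_1 \approx_h s_2$. Then for every (open) formula
$\Phi$ of $\mathcal{L}_A$, and all valuations $v_1$ and $v_2$ that assign to each of
its free variables a value $d_1 \in \mathrm{ADOM}(db_1(s_1))$ and
$d_2 \in \mathrm{ADOM}(db_2(s_2))$, such that $d_2 = h(d_1)$, we have that

$$\Upsilon_1, s_1 \models \Phi v_1 \text{ if and only if } \Upsilon_2, s_2 \models \Phi v_2.$$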

Proof. We proceed by induction on the structure of $\Phi$.

Local first-order queries (base case). Consider $\Phi = Q$, where
$Q$ is an (open) FO query. Since $h$ induces an isomorphism
between $db_1(s_1)$ and $db_2(s_2)$, for all valuations
$v_1$ and $v_2$ that assign to each free variable of $Q$ a value
$d_1 \in \mathrm{ADOM}(db_1(s_1))$ and $d_2 \in \mathrm{ADOM}(db_2(s_2))$, such
that $d_2 = h(d_1)$, we have that
$ans(Qv_1, db_1(s_1)) = ans(Qv_2, db_2(s_2))$.

Negation. By induction hypothesis, for every (open) formula $\Phi$
and all valuations $v_1$ and $v_2$ that assign to each of its
free variables a value $d_1 \in \mathrm{ADOM}(db_1(s_1))$ and
$d_2 \in \mathrm{ADOM}(db_2(s_2))$, such that $d_2 = h(d_1)$, we have that
$\Upsilon_1, s_1 \models \Phi v_1$ if and only if $\Upsilon_2, s_2 \models \Phi v_2$.
By definition, $\Upsilon_1, s_1 \models (\neg\Phi) v_1$ if and only if
$\Upsilon_1, s_1 \not\models \Phi v_1$, and, by induction hypothesis,
$\Upsilon_1, s_1 \not\models \Phi v_1$ if and only if $\Upsilon_2, s_2 \not\models \Phi v_2$,
which corresponds to $\Upsilon_2, s_2 \models (\neg\Phi) v_2$.

Conjunction. By induction hypothesis, for every (open) formula
$\Phi_i$ and all valuations $v_1$ and $v_2$ that assign to each of
its free variables a value $d_1 \in \mathrm{ADOM}(db_1(s_1))$ and
$d_2 \in \mathrm{ADOM}(db_2(s_2))$, such that $d_2 = h(d_1)$, we have that
$\Upsilon_1, s_1 \models \Phi_i v_1$ if and only if $\Upsilon_2, s_2 \models \Phi_i v_2$,
with $i \in \{1, 2\}$. Hence, $\Upsilon_1, s_1 \models \Phi_1 v_1$ and
$\Upsilon_1, s_1 \models \Phi_2 v_1$ if and only if $\Upsilon_2, s_2 \models \Phi_1 v_2$ and
$\Upsilon_2, s_2 \models \Phi_2 v_2$. By definition, we therefore have
$\Upsilon_1, s_1 \models (\Phi_1 \wedge \Phi_2) v_1$ if and only if
$\Upsilon_2, s_2 \models (\Phi_1 \wedge \Phi_2) v_2$.

Modal operator. Consider two states $s_1 \in \Sigma_1$ and $s_2 \in \Sigma_2$ such
that $s_1 \approx_h s_2$. By definition, given a valuation $v_1$ that
assigns to each free variable of $\Phi$ a value $d_1 \in \mathrm{ADOM}(db_1(s_1))$,
we have that $\Upsilon_1, s_1 \models (\langle - \rangle \Phi) v_1$ if there exists a
transition $s_1 \Rightarrow_1 s_1'$ such that $\Upsilon_1, s_1' \models \Phi v_1$. Since
$s_1 \approx_h s_2$, there exists a transition $s_2 \Rightarrow_2 s_2'$ such that
$s_1' \approx_{h'} s_2'$, where $h'$ extends $h$. By induction hypothesis, for
every valuation $v_2$ that assigns to each free variable $x$ of $\Phi$ a
value $d_2 \in \mathrm{ADOM}(db_2(s_2))$, such that $d_2 = h'(d_1)$ with
$x/d_1 \in v_1$, we have that $\Upsilon_1, s_1' \models \Phi v_1$ if and only if
$\Upsilon_2, s_2' \models \Phi v_2$. Since $h'$ is an extension of $h$, and $v_1$
assigns to each free variable of $\Phi$ a value
$d_1 \in \mathrm{ADOM}(db_1(s_1)) \subseteq \mathrm{DOM}(h)$, we observe
that for every pair of assignments $x/d_1 \in v_1$ and $x/d_2 \in v_2$,
it holds that $d_2 = h'(d_1) = h(d_1)$. Furthermore, since $h$
induces an isomorphism between $db_1(s_1)$ and $db_2(s_2)$, for each
assignment $x/d_2 \in v_2$, we have that $d_2 \in \mathrm{ADOM}(db_2(s_2))$.
Considering that $s_2 \Rightarrow_2 s_2'$, by definition we therefore get
$\Upsilon_2, s_2 \models (\langle - \rangle \Phi) v_2$.

The other direction can be proven in a symmetric way.

Quantification. Consider two states $s_1 \in \Sigma_1$ and $s_2 \in \Sigma_2$ such
that $s_1 \approx_h s_2$. By definition, given a formula $\Phi$ and a
valuation $v_1'$ that assigns to each free variable of $\Phi$ a value
$d_1 \in \mathrm{DOM}(h)$, we have that
$\Upsilon_1, s_1 \models (\exists x.\mathrm{LIVE}(x) \wedge \Phi) v_1'$
if and only if there exists $d \in \mathrm{ADOM}(db_1(s_1))$ such that
$\Upsilon_1, s_1 \models \Phi v_1$, where $v_1 = v_1'[x/d]$. By induction
hypothesis, for every valuation $v_2$ that assigns to each free variable
$y$ of $\Phi$ a value $d_2 \in \mathrm{ADOM}(db_2(s_2))$, such that $d_2 = h(d_1)$
with $y/d_1 \in v_1$, we have that $\Upsilon_1, s_1 \models \Phi v_1$ if and only
if $\Upsilon_2, s_2 \models \Phi v_2$. More specifically, the structure of $v_2$ is
$v_2 = v_2'[x/d']$, where $d' = h(d) \in \mathrm{ADOM}(db_2(s_2))$ because
$h$ induces an isomorphism between $db_1(s_1)$ and $db_2(s_2)$.
Hence, we get $\Upsilon_2, s_2 \models (\exists x.\mathrm{LIVE}(x) \wedge \Phi) v_2'$.

The other direction can be proven in a symmetric way. □
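
The proof repeatedly relies on $h$ inducing an isomorphism between the
two database instances. On finite instances this property can be tested
directly. The sketch below is illustrative: instances are encoded as
dictionaries from relation names to sets of tuples (with every relation
of the schema listed, possibly with an empty set), $h$ is a Python
dictionary, and the function names are assumptions of this sketch.

```python
def adom(db):
    """Active domain: all values occurring in some tuple of the instance."""
    return {value for tuples in db.values() for tup in tuples for value in tup}

def induces_isomorphism(h, db1, db2):
    """True if the partial bijection h maps db1 exactly onto db2."""
    if not adom(db1) <= set(h):          # h must be defined on adom(db1)
        return False
    if len(set(h.values())) != len(h):   # h must be injective
        return False
    if set(db1) != set(db2):             # same relation names
        return False
    image = {rel: {tuple(h[v] for v in tup) for tup in tuples}
             for rel, tuples in db1.items()}
    return all(image[rel] == set(db2[rel]) for rel in db1)

DB1 = {"R": {("a",)}, "Q": {("a", "b")}}
DB2 = {"R": {("1",)}, "Q": {("1", "2")}}
```

With these instances, the map sending `a` to `1` and `b` to `2` passes
the test, while the map that swaps the two images does not, because it
fails to preserve the `R` tuple.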

Proof of Theorem 3.1. We prove the theorem in two steps.
First, we show that Lemma A.1 can be extended to the infinitary
version of $\mathcal{L}_A$ that supports arbitrary countable disjunction. Then,
we recall that fixpoints can be translated into this infinitary logic,
thus guaranteeing invariance for the whole $\mu\mathcal{L}_A$ logic.

Let $\Psi$ be a countable ordered set of open $\mathcal{L}_A$ formulae. Given a
transition system $\Upsilon = \langle \Delta, \mathcal{R}, \Sigma, s_0, db, \Rightarrow \rangle$, the
semantics of $\bigvee \Psi$ is
$(\bigvee \Psi)^{\Upsilon}_{v} = \bigcup_{\psi \in \Psi} (\psi)^{\Upsilon}_{v}$.
Therefore, given a state $s$ of $\Upsilon$ and a
variable valuation $v$ that assigns to each free variable of $\Psi$ a value
$d \in \mathrm{ADOM}(db(s))$, we have $\Upsilon, s \models (\bigvee \Psi) v$ if and only
if $\Upsilon, s \models \psi v$ for some $\psi \in \Psi$. Arbitrary countable
conjunction is obtained for free because of negation.

We show that the invariance result proven in Lemma A.1 trivially
extends to this arbitrary countable disjunction. Lemma A.1
guarantees that invariance is preserved for any finite disjunction.
Formally, let $\{\Phi_1, \ldots, \Phi_n\}$ be a finite set of open $\mathcal{L}_A$
formulae. Consider two states $s_1 \in \Sigma_1$ and $s_2 \in \Sigma_2$ such that
$s_1 \approx_h s_2$. Then, for all valuations $v_1$ and $v_2$ that assign to
each free variable of $\{\Phi_1, \ldots, \Phi_n\}$ a value
$d_1 \in \mathrm{ADOM}(db_1(s_1))$ and $d_2 \in \mathrm{ADOM}(db_2(s_2))$, such that
$d_2 = h(d_1)$, we have that
$\Upsilon_1, s_1 \models (\bigvee_{i \in \{1, \ldots, n\}} \Phi_i) v_1$ if and only if
$\Upsilon_2, s_2 \models (\bigvee_{i \in \{1, \ldots, n\}} \Phi_i) v_2$.

Now consider two valuations $v_1'$ and $v_2'$ that assign to each
free variable of $\bigvee \Psi$ a value $d_1 \in \mathrm{ADOM}(db_1(s_1))$ and
$d_2 \in \mathrm{ADOM}(db_2(s_2))$, such that $d_2 = h(d_1)$. By definition,
$\Upsilon_1, s_1 \models (\bigvee \Psi) v_1'$ if and only if there exists
$\psi_k \in \Psi$ such that $\Upsilon_1, s_1 \models \psi_k v_1'$. The proof of
invariance for the infinitary $\mathcal{L}_A$ logic is then obtained by
observing that $\Upsilon_1, s_1 \models (\bigvee \Psi) v_1'$ if and only if
$\Upsilon_1, s_1 \models (\bigvee_{i \in \{1, \ldots, k\}} \psi_i) v_1'$ if and only if
$\Upsilon_2, s_2 \models (\bigvee_{i \in \{1, \ldots, k\}} \psi_i) v_2'$ if and only if
$\Upsilon_2, s_2 \models (\bigvee \Psi) v_2'$.

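In order to extend the result to the whole $\mu\mathcal{L}_A$, we resort to
the well-known result stating that fixpoints of the $\mu$-calculus can
be translated into the infinitary Hennessy-Milner logic by iterating
over approximants, where the approximant of index $\alpha$ is denoted
by $\mu^{\alpha} Z.\Phi$ ($\nu^{\alpha} Z.\Phi$). This is a standard result that also
holds for $\mu\mathcal{L}_A$. In particular, approximants are built as follows:

$$\mu^{0} Z.\Phi = \mathsf{false}, \qquad \nu^{0} Z.\Phi = \mathsf{true},$$
$$\mu^{\beta+1} Z.\Phi = \Phi[Z/\mu^{\beta} Z.\Phi], \qquad \nu^{\beta+1} Z.\Phi = \Phi[Z/\nu^{\beta} Z.\Phi],$$
$$\mu^{\lambda} Z.\Phi = \bigvee_{\beta < \lambda} \mu^{\beta} Z.\Phi, \qquad \nu^{\lambda} Z.\Phi = \bigwedge_{\beta < \lambda} \nu^{\beta} Z.\Phi,$$

where $\lambda$ is a limit ordinal, and where fixpoints and their
approximants are connected by the following properties: given a transition
system $\Upsilon$ and a state $s$ of $\Upsilon$

• $s \in (\mu Z.\Phi)^{\Upsilon}_{v,V}$ if and only if there exists an ordinal $\alpha$
such that $s \in (\mu^{\alpha} Z.\Phi)^{\Upsilon}_{v,V}$ and, for every $\beta < \alpha$, it
holds that $s \notin (\mu^{\beta} Z.\Phi)^{\Upsilon}_{v,V}$;

• $s \notin (\nu Z.\Phi)^{\Upsilon}_{v,V}$ if and only if there exists an ordinal $\alpha$
such that $s \notin (\nu^{\alpha} Z.\Phi)^{\Upsilon}_{v,V}$ and, for every $\beta < \alpha$, it
holds that $s \in (\nu^{\beta} Z.\Phi)^{\Upsilon}_{v,V}$. □
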
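
On a finite transition system the chain of approximants of a least
fixpoint stabilizes after finitely many steps, so it can be computed by
plain iteration. The sketch below illustrates this for the propositional
case only (no first-order quantification); the function names `lfp` and
`diamond`, the successor map `SUCC`, and the sample formula
$\mu Z.(p \vee \langle - \rangle Z)$ are assumptions chosen for
illustration.

```python
def lfp(f):
    """Least fixpoint of a monotone set function, via approximants."""
    current = frozenset()            # approximant of index 0: false
    while True:
        nxt = frozenset(f(current))  # next approximant: Phi[Z / current]
        if nxt == current:
            return current
        current = nxt

def diamond(succ, states):
    """<->Z: states with at least one successor in `states`."""
    return {s for s, targets in succ.items() if targets & states}

# Small transition system: 0 -> 1 -> 2 -> 2 and 3 -> 3; p holds only in 2.
SUCC = {0: {1}, 1: {2}, 2: {2}, 3: {3}}
P = {2}

# mu Z. (p or <->Z): states from which a p-state is reachable.
CAN_REACH_P = lfp(lambda z: P | diamond(SUCC, z))
```

State 3 is excluded from `CAN_REACH_P` because its only run loops on
itself and never visits a state where `p` holds.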
A.2 Persistence Preserving Mu-Calculus 

We prove persistence preserving bisimulation invariance for
$\mu\mathcal{L}_P$. To prove the invariance result, we adopt a two-step approach.
We first prove the result for the logic $\mathcal{L}_P$, obtained from $\mu\mathcal{L}_P$
by dropping the predicate variables and the fixpoint constructs. Such a
logic corresponds to a first-order variant of the Hennessy-Milner
logic; note that the semantics of this logic is completely independent
of the second-order valuation. We then extend the result to
the whole $\mu\mathcal{L}_P$ by dealing with fixpoints.

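Lemma A.2. Consider two transition systems
$\Upsilon_1 = \langle \Delta_1, \mathcal{R}, \Sigma_1, s_{01}, db_1, \Rightarrow_1 \rangle$ and
$\Upsilon_2 = \langle \Delta_2, \mathcal{R}, \Sigma_2, s_{02}, db_2, \Rightarrow_2 \rangle$,
a partial bijection $h$ between $\Delta_1$ and $\Delta_2$, and two states
$s_1 \in \Sigma_1$ and $s_2 \in \Sigma_2$ such that $s_1 \sim_h s_2$. Then for every
(open) formula $\Phi$ of $\mathcal{L}_P$, and all valuations $v_1$ and $v_2$ that
assign to each of its free variables a value $d_1 \in \mathrm{ADOM}(db_1(s_1))$
and $d_2 \in \mathrm{ADOM}(db_2(s_2))$, such that $d_2 = h(d_1)$, we have that

$$\Upsilon_1, s_1 \models \Phi v_1 \text{ if and only if } \Upsilon_2, s_2 \models \Phi v_2.$$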

Proof. We proceed by induction on the structure of $\Phi$. In
particular, we discuss the two base cases of
$\langle - \rangle(\mathrm{LIVE}(x) \wedge \Phi)$ and $[-](\mathrm{LIVE}(x) \wedge \Phi')$ with one
variable. For convenience, we rewrite the latter case to
$\langle - \rangle(\mathrm{LIVE}(x) \rightarrow \Phi)$, where $\Phi = \neg\Phi'$. The other
cases are derived, or proven in the same way as done for Lemma A.1.

Modal operator (conjunction). Consider two states $s_1 \in \Sigma_1$ and
$s_2 \in \Sigma_2$ such that $s_1 \sim_h s_2$. Let $x$ be the only free variable
of $\Phi$, and $x/d$ a valuation such that $d \in \mathrm{ADOM}(db_1(s_1))$.
Then, by definition we have that
$\Upsilon_1, s_1 \models (\langle - \rangle(\mathrm{LIVE}(x) \wedge \Phi))[x/d]$ if there
exists a transition $s_1 \Rightarrow_1 s_1'$ such that
$d \in \mathrm{ADOM}(db_1(s_1'))$ and $\Upsilon_1, s_1' \models \Phi[x/d]$. Since
$s_1 \sim_h s_2$, there exists a transition $s_2 \Rightarrow_2 s_2'$ such that
$s_1' \sim_{h'} s_2'$, where $h'$ is compatible with $h$. By induction
hypothesis and by considering that $h'$ is an isomorphism between
$db_1(s_1')$ and $db_2(s_2')$, we have that $\Upsilon_1, s_1' \models \Phi[x/d]$ if
and only if $h'(d) \in \mathrm{ADOM}(db_2(s_2'))$ and
$\Upsilon_2, s_2' \models \Phi[x/h'(d)]$. Now we observe that
$d \in \mathrm{ADOM}(db_1(s_1)) \cap \mathrm{ADOM}(db_1(s_1'))$ and
$h'$ is an extension of
$h|_{\mathrm{ADOM}(db_1(s_1)) \cap \mathrm{ADOM}(db_1(s_1'))}$. This implies that
$h'(d) = h(d) \in \mathrm{ADOM}(db_2(s_2))$, because $h$ is
an isomorphism between $db_1(s_1)$ and $db_2(s_2)$. Considering
that $s_2 \Rightarrow_2 s_2'$, by definition we therefore get
$\Upsilon_2, s_2 \models (\langle - \rangle(\mathrm{LIVE}(x) \wedge \Phi))[x/h(d)]$.
The other direction can be proven in a symmetric way.

Modal operator (implication). Consider two states $s_1 \in \Sigma_1$ and
$s_2 \in \Sigma_2$ such that $s_1 \sim_h s_2$. Let $x$ be the only free variable
of $\Phi$, and $x/d$ a valuation such that $d \in \mathrm{ADOM}(db_1(s_1))$.
Then, by definition we have that $\Upsilon_1, s_1 \models (\langle - \rangle(\mathrm{LIVE}(x) \rightarrow \Phi))[x/d]$
if there exists a transition $s_1 \Rightarrow_1 s_1'$ such that
$d \notin \mathrm{ADOM}(db_1(s_1'))$ or $\Upsilon_1, s_1' \models \Phi[x/d]$. Since $s_1 \sim_h s_2$, there
exists a transition $s_2 \Rightarrow_2 s_2'$ such that $s_1' \sim_{h'} s_2'$, where
$h'$ is an extension of $h|_{\mathrm{ADOM}(db_1(s_1)) \cap \mathrm{ADOM}(db_1(s_1'))}$. Now we
discuss the two cases in which $d \notin \mathrm{ADOM}(db_1(s_1'))$ and
$d \in \mathrm{ADOM}(db_1(s_1'))$.

• Assume that $d \notin \mathrm{ADOM}(db_1(s_1'))$. Since
$s_1 \sim_h s_2$, we have that $h(d) \in \mathrm{ADOM}(db_2(s_2))$.
Now, towards contradiction, let us assume that
$h(d) \in \mathrm{ADOM}(db_2(s_2'))$. Hence, we have
$h(d) \in \mathrm{ADOM}(db_2(s_2)) \cap \mathrm{ADOM}(db_2(s_2'))$. Observe
that $h'$ is an extension of $h|_{\mathrm{ADOM}(db_1(s_1)) \cap \mathrm{ADOM}(db_1(s_1'))}$,
which is equivalent to stating that $h'^{-1}$ is an extension
of $h^{-1}|_{\mathrm{ADOM}(db_2(s_2)) \cap \mathrm{ADOM}(db_2(s_2'))}$. This implies that
$h^{-1}(h(d)) = h'^{-1}(h(d)) = d$. Since $h'$ is an isomorphism
between $db_1(s_1')$ and $db_2(s_2')$, then $d \in \mathrm{ADOM}(db_1(s_1'))$,
and this contradicts the hypothesis.

• Assume that $d \in \mathrm{ADOM}(db_1(s_1'))$. Then we can
proceed following the line of reasoning used for the case of
$\langle - \rangle(\mathrm{LIVE}(x) \wedge \Phi)$.

The other direction can be proven in a symmetric way. □

Proof of Theorem 3.2. The proof is analogous to that of
Theorem 3.1, but now using Lemma A.2. □

B. DETERMINISTIC SERVICES 
B.2 Run-Bounded Systems 






Proof of Theorem 4.1. The proof is by reduction from the
halting problem. Given a deterministic Turing Machine TM, we
define a DCDS $\mathcal{S}$ with deterministic services and a propositional safety
property $\Phi$, such that TM halts if and only if $\Upsilon_{\mathcal{S}} \not\models \Phi$.

Intuitively, every run of $\Upsilon_{\mathcal{S}}$ simulates a run of TM. Each state $s$
of $\Upsilon_{\mathcal{S}}$ models a configuration of TM. A transition in $\Upsilon_{\mathcal{S}}$ models a
transition in TM. We give the construction next.

The DCDS. To model a configuration of TM in a relation of
the DCDS state, we model the visited tape segment as a graph
whose nodes are cell identifiers, and whose edges form a linear
path. The edge relation is called right, with the intended meaning
that right(x, y) declares cell y to be the right neighbor of cell x
on the tape. We also introduce a relation sym, with sym(c, s)
intended to model that cell c holds symbol s. Unary relation head
models the head position: head(c) means that the head points to
cell c. Finally, unary relation state keeps the state of TM, and
a boolean predicate halted is meant to detect that TM has halted.
In summary, the data layer $\mathcal{D} = \langle \mathcal{C}, \mathcal{R}, \mathcal{E}, \mathcal{I}_0 \rangle$ of $\mathcal{S}$ contains the
schema $\mathcal{R}$ = {right/2, sym/2, head/1, state/1, halted/0}. We detail $\mathcal{E}$
and $\mathcal{I}_0$ after sketching the process layer.

There is a single action $\alpha$, in charge of simulating the transitions
of TM. It has no parameters, and its guard is always true: $\mathit{true} \mapsto \alpha$.
$\alpha$ contains the following effects.

$e_{copy}$ simply copies the part of the tape that stays unchanged in
the transition because the head doesn't point to it:

$$e_{copy} : right(X, Y) \wedge right(Y, Z) \wedge sym(X, SX) \wedge sym(Y, SY) \wedge sym(Z, SZ) \wedge \neg(head(X) \wedge head(Y) \wedge head(Z))$$
$$\rightsquigarrow \{right(X, Y), right(Y, Z), sym(X, SX), sym(Y, SY), sym(Z, SZ)\}$$

In addition, we add effects for each entry of TM's transition relation
$\delta$.

For instance, if $(p, b, \rightarrow) \in \delta(s, a)$ (i.e. $\delta$ prescribes that in state
s, if the head points to a cell containing symbol a, TM changes state
to p, the cell's symbol is overwritten with b, and the head moves to
the right), we introduce two effects. One for the case when the tape
needs no extension to the right:

$$e^{noext}_{s,a,p,b,\rightarrow} : right(X, Y) \wedge sym(X, a) \wedge sym(Y, SY) \wedge SY \neq \omega \wedge head(X) \wedge state(s)$$
$$\rightsquigarrow \{right(X, Y), sym(X, b), sym(Y, SY), head(Y), state(p)\},$$

and one when it does:

$$e^{ext}_{s,a,p,b,\rightarrow} : right(X, Y) \wedge sym(X, a) \wedge sym(Y, \omega) \wedge head(X) \wedge state(s)$$
$$\rightsquigarrow \{right(X, Y), right(Y, newCell(Y)), sym(X, b), sym(Y, \bot), sym(newCell(Y), \omega), head(Y), state(p)\}.$$



To distinguish between constants and variables in the above effect
specifications, we use capital letters for the latter and lower-case
letters for the former. Notice that the extension is performed by
calling service newCell, which is meant to return a fresh cell id
(we show below how to ensure this). Also notice the use of special
symbol $\omega$, which is reserved for labeling the end of the tape segment.
Finally, special symbol $\bot$ is by convention used to initialize
the tape prior to starting the run.

We are not quite done, as we still need to ensure that right induces
a linear order on the collection of cell identifiers generated
during the run. Notice that this cannot be achieved exclusively
by declaring FO constraints on right, as linear orders are not
FO-axiomatizable. The solution must exploit the interplay between
constraints on right and the way $\Upsilon_{\mathcal{S}}$ transitions.

Observe that, by definition of the effects that extend right (e.g.
$e^{ext}_{s,a,p,b,\rightarrow}$ above), at each step the current right end of the tape
segment obtains at most one new successor. However, if the call of
service newCell returns a cell id that already appears in the tape
segment, then there can be some cell with several predecessors
according to right. We rule out this case by declaring the second
component of right to be a key. It follows that right must be
either (i) a linear path (possibly starting from a source node that has a
self-loop), or (ii) it must contain a simple cycle involving more than
one cell id. The simple cycle is created at the step when newCell
returns the id of the leftmost cell.

We wish to force case (i). To rule out case (ii), we proceed as
follows: we initialize right to contain a source node that can
never be a cell id because it cannot be returned by newCell without
violating the key constraint on right. To this end, we initialize $\mathcal{I}_0$
to

• $right^{\mathcal{I}_0} = \{(0, 0), (0, 1), (1, 2)\}$,

• $sym^{\mathcal{I}_0} = \{(1, \$), (2, \omega)\}$,

• $head^{\mathcal{I}_0} = \{2\}$,

• $state^{\mathcal{I}_0} = \{s_0\}$,

• $halted^{\mathcal{I}_0} = \{\}$,

where $s_0$ is the initial state of TM.

Notice that, if we disregard cell 0, $\mathcal{I}_0$ contains the representation
of an empty tape (symbol \$ labels the left end, symbol $\omega$ the right
end). Also notice that $right^{\mathcal{I}_0}$ has type (i). An easy induction
shows that every run prefix must also construct a right relation of
type (i), since any attempt to extend right with an edge back to one
of its existing nodes violates the key constraint.

Because symbol \$ denotes the left end of the tape, it also follows
easily from the behavior of TM that during the run, the head will
never reach the special cell 0, so head can only take values from
the suffix of right starting at cell 1, which is a true linear path.
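
The key-constraint argument can be illustrated with a small Python sketch. The helper name `violates_key` and the encoding of right as a set of pairs are assumptions made for this example only; they are not part of the construction. The sketch checks whether adding an edge would give some cell two distinct predecessors, which is exactly what the key on the second component of right forbids.

```python
def violates_key(right, new_edge):
    """Return True if adding new_edge = (x, y) breaks the key on the
    second component of `right`, i.e. y would get two distinct predecessors."""
    x, y = new_edge
    return any(b == y and a != x for (a, b) in right)


# Initial relation from the construction: source node 0 with a self-loop.
right0 = {(0, 0), (0, 1), (1, 2)}

# newCell returns a fresh id (3): the extension is accepted.
fresh_ok = not violates_key(right0, (2, 3))

# newCell returns an id already on the tape (0, 1 or 2): always rejected,
# because every existing node already has a predecessor.
reused_rejected = all(violates_key(right0, (2, c)) for c in (0, 1, 2))
```

Under this encoding, the self-loop on the source node 0 is what guarantees that even the leftmost node already has a predecessor, so no edge back into the existing path can be added.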

Now assume without loss of generality that the TM is normalized
to enter a particular sink state h when it halts. We add effect
$e_h$, which detects the halting state and sets the boolean predicate
halted.

Observe that for the cases when the head stays in place or moves
left, no tape extension is required, so each such entry in the transition
relation corresponds to a single effect.

The property. We define the propositional safety property $\Phi$ as

$$\Phi := G \neg halted.$$

It is easy to see that the runs of $\Upsilon_{\mathcal{S}}$ correspond one-to-one to the
runs of TM. Since $\Phi$ is a linear-time property, this run correspondence
suffices to guarantee that $\Upsilon_{\mathcal{S}} \models \Phi$ if and only if TM does
not halt. □



Proof of Theorem 4.2. The proof is directly obtained from
Theorem 4.4, noticing that model checking of propositional
$\mu$-calculus formulae over finite transition systems is decidable
[22]. □

Proof of Theorem 4.3. In view of proving this result, we first
introduce a key lemma. We say that a transition system is
adom-inflationary if the active domain of every state is included in its
successor's active domain. We say that a DCDS $\mathcal{S}$ is
adom-inflationary if $\Upsilon_{\mathcal{S}}$ is adom-inflationary. We can show that, for
adom-inflationary transition systems, persistence-preserving
bisimilarity coincides with history-preserving bisimilarity.

Lemma B.1. Consider two adom-inflationary DCDSs with
non-deterministic services, $\mathcal{S}_1$, $\mathcal{S}_2$. Then $\Upsilon_{\mathcal{S}_1} \sim \Upsilon_{\mathcal{S}_2}$ if and only if
$\Upsilon_{\mathcal{S}_1} \approx \Upsilon_{\mathcal{S}_2}$ (that is, the two systems are persistence-preserving
bisimilar if and only if they are history-preserving bisimilar).



Proof of Lemma B.1. A comparison of the two notions of
bisimilarity reveals that the difference is in the local condition, as
follows.

Notice first that what both bisimilarity notions have in common
is that they mention bisimilar states $s_1$ and $s_2$ and witness
isomorphism $h$, and their successors $s_1 \Rightarrow s_1'$, $s_2 \Rightarrow s_2'$ such that $s_1'$
and $s_2'$ are bisimilar as witnessed by isomorphism $h'$. The key
difference lies in how $h'$ and $h$ are related. In the history-preserving
flavor, $h'$ must extend $h$, while in the persistence-preserving flavor
$h'$ must only extend $h|_{\mathrm{ADOM}(s_1) \cap \mathrm{ADOM}(s_1')}$.

Clearly, history-preserving bisimilarity implies persistence-preserving
bisimilarity. However, notice that if the transition
systems are adom-inflationary, then the converse also holds.
Indeed, assume $s_1 \sim_h s_2$. By definition, $h'$ extends
$h|_{\mathrm{ADOM}(s_1) \cap \mathrm{ADOM}(s_1')}$. But because of adom-inflation,
$\mathrm{ADOM}(s_1) \subseteq \mathrm{ADOM}(s_1')$ and hence
$\mathrm{ADOM}(s_1) \cap \mathrm{ADOM}(s_1') = \mathrm{ADOM}(s_1)$, yielding
$h|_{\mathrm{ADOM}(s_1) \cap \mathrm{ADOM}(s_1')} = h$. Hence, $h'$ extends $h$, which is the
condition for history-preserving bisimilarity. □
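
The set-theoretic step in this proof can be checked on a toy instance. The following Python sketch is only an illustration under assumed encodings (active domains as Python sets, the isomorphism h as a dictionary, and a hypothetical helper `restrict`); it shows that restricting h to the intersection of the two active domains leaves h unchanged exactly when the successor's active domain contains the current one.

```python
def restrict(h, domain):
    """Restriction of the mapping h to the given set of values."""
    return {k: v for k, v in h.items() if k in domain}


h = {"a": 1, "b": 2}          # h is defined on ADOM(s1)
adom_s1 = {"a", "b"}

# Adom-inflationary step: ADOM(s1) is included in ADOM(s1').
adom_s1_next = {"a", "b", "c"}
inflationary_case = restrict(h, adom_s1 & adom_s1_next)

# Non-inflationary step: value "b" is dropped in the successor.
adom_s1_next_lossy = {"a", "c"}
lossy_case = restrict(h, adom_s1 & adom_s1_next_lossy)
```

In the inflationary case the restriction coincides with h, so any h' that extends the restriction also extends h; in the lossy case the restriction forgets "b", which is precisely the freedom that persistence-preserving bisimilarity allows.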



Proof of Theorem 4.3. We prove the result by exploiting
the reduction postulated by Theorem 6.1.

Starting from a run-bounded DCDS $\mathcal{D}$ with deterministic services,
the reduction gives us a state-bounded DCDS $\mathcal{N}$ with
nondeterministic services. Moreover, the two transition systems have
the same domains, and the projection of $\Upsilon_{\mathcal{N}}$ on the schema of $\mathcal{D}$
coincides with $\Upsilon_{\mathcal{D}}$ (Theorem 6.1 ii)). In more detail, denoting the
schema of $\mathcal{D}$ with $\mathcal{R}_{\mathcal{D}}$ and the schema of $\mathcal{N}$ with $\mathcal{R}_{\mathcal{N}}$, there is a
bijection $\beta$ between the states of $\Upsilon_{\mathcal{D}}$ and the states of $\Upsilon_{\mathcal{N}}$, such
that $s = \beta(s)|_{\mathcal{R}_{\mathcal{D}}}$. Clearly, this implies that $\Upsilon_{\mathcal{D}}$ and $\Upsilon_{\mathcal{N}}$ satisfy
the same $\mu\mathcal{L}$ formulae.

However, a weaker statement suffices for our purpose. By definition
of history-preserving bisimilarity, Theorem 6.1 ii) implies
that

(1) $\Upsilon_{\mathcal{D}} \approx \Upsilon_{\mathcal{N}}$.

We recall that on the way to proving Theorem 5.3 it is shown in
Theorem 5.4 that since $\mathcal{N}$ is state-bounded, we can construct using
algorithm RCYCL a finite-state abstract transition system $\Upsilon_{\mathcal{F}}$ such
that $\Upsilon_{\mathcal{N}} \sim \Upsilon_{\mathcal{F}}$ ($\Upsilon_{\mathcal{F}}$ is an eventually recycling pruning of $\Upsilon_{\mathcal{N}}$).

An inspection of the reduction in Theorem 6.1 reveals that $\Upsilon_{\mathcal{N}}$
is adom-inflationary. But since $\Upsilon_{\mathcal{N}} \sim \Upsilon_{\mathcal{F}}$, it follows that $\Upsilon_{\mathcal{F}}$
is adom-inflationary as well (by the local condition of
persistence-preserving bisimilarity). Thus Lemma B.1 applies, yielding

(2) $\Upsilon_{\mathcal{N}} \approx \Upsilon_{\mathcal{F}}$.

By (1) and (2), and by transitivity of $\approx$, we obtain that
$\Upsilon_{\mathcal{D}} \approx \Upsilon_{\mathcal{F}}$. □



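Proof of Theorem 4.4. Theorem 4.3 implies that, given a
DCDS $\mathcal{S}$, there exists a finite-state transition system
$\Theta_{\mathcal{S}} = \langle \mathcal{U}, \mathcal{R}, \Sigma_a, s_0, db_a, \Rightarrow_a \rangle$ that is history-preserving bisimilar to
the concrete transition system $\Upsilon_{\mathcal{S}} = \langle \mathcal{U}, \mathcal{R}, \Sigma, s_0, db, \Rightarrow \rangle$. Thus,
it is possible to use $\Theta_{\mathcal{S}}$ in place of $\Upsilon_{\mathcal{S}}$ for verification. In particular,
given a $\mu\mathcal{L}_A$ property $\Phi$, the verification problem is reduced
to $\Theta_{\mathcal{S}} \models \Phi$. Let $\mathrm{ADOM}(\Theta_{\mathcal{S}}) = \bigcup_{s_i \in \Sigma_a} \mathrm{ADOM}(db_a(s_i))$. If $\Theta_{\mathcal{S}}$ is
finite-state, then there exists a bound $b$ such that $|\mathrm{ADOM}(\Theta_{\mathcal{S}})| < b$.
Consequently, it is possible to transform $\Phi$ into an equivalent finite
propositional $\mu$-calculus formula $\mathrm{PROP}(\Phi)$ as follows:

$$\mathrm{PROP}(Q) = Q$$
$$\mathrm{PROP}(\neg\Psi) = \neg\mathrm{PROP}(\Psi)$$
$$\mathrm{PROP}(\Psi_1 \wedge \Psi_2) = \mathrm{PROP}(\Psi_1) \wedge \mathrm{PROP}(\Psi_2)$$
$$\mathrm{PROP}(\langle - \rangle\Psi) = \langle - \rangle\mathrm{PROP}(\Psi)$$
$$\mathrm{PROP}(Z) = Z$$
$$\mathrm{PROP}(\mu Z.\Psi) = \mu Z.\mathrm{PROP}(\Psi)$$
$$\mathrm{PROP}(\exists x.\mathrm{LIVE}(x) \wedge \Psi(x)) = \bigvee_{t_i \in \mathrm{ADOM}(\Theta_{\mathcal{S}})} \mathrm{LIVE}(t_i) \wedge \mathrm{PROP}(\Psi(t_i))$$

Clearly, $\Theta_{\mathcal{S}} \models \Phi$ if and only if $\Theta_{\mathcal{S}} \models \mathrm{PROP}(\Phi)$. The proof is then
obtained by observing that verification of $\mu$-calculus formulae over
finite transition systems is decidable [22]. □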
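
The translation PROP can be sketched in Python as follows. This is an illustrative sketch only: the tuple-based formula encoding, the tag names, and the helpers `substitute` and `prop` are assumptions introduced for this example, and variables are assumed to have names distinct from all constants. The only non-trivial case is the quantifier, which is expanded into a finite disjunction over the active domain.

```python
from functools import reduce

# Assumed encoding: ("atom", name, args), ("live", t), ("var", Z), ("false",),
# ("not", f), ("and", f, g), ("or", f, g), ("dia", f), ("mu", Z, f),
# ("exists", x, f) read as "exists x. LIVE(x) and f".

def substitute(phi, x, t):
    """Replace free occurrences of individual variable x by constant t."""
    tag = phi[0]
    if tag == "atom":
        return ("atom", phi[1], tuple(t if a == x else a for a in phi[2]))
    if tag == "live":
        return ("live", t if phi[1] == x else phi[1])
    if tag in ("var", "false"):
        return phi
    if tag in ("not", "dia"):
        return (tag, substitute(phi[1], x, t))
    if tag in ("and", "or"):
        return (tag, substitute(phi[1], x, t), substitute(phi[2], x, t))
    if tag == "mu":
        return ("mu", phi[1], substitute(phi[2], x, t))
    if tag == "exists":
        # An inner quantifier on the same variable shadows x.
        return phi if phi[1] == x else ("exists", phi[1], substitute(phi[2], x, t))
    raise ValueError(f"unknown formula tag: {tag}")

def prop(phi, adom):
    """Propositionalize phi over the finite active domain adom."""
    tag = phi[0]
    if tag in ("atom", "live", "var", "false"):
        return phi
    if tag in ("not", "dia"):
        return (tag, prop(phi[1], adom))
    if tag in ("and", "or"):
        return (tag, prop(phi[1], adom), prop(phi[2], adom))
    if tag == "mu":
        return ("mu", phi[1], prop(phi[2], adom))
    if tag == "exists":
        x, body = phi[1], phi[2]
        disjuncts = [("and", ("live", t), prop(substitute(body, x, t), adom))
                     for t in sorted(adom)]
        # An empty domain yields the empty disjunction, i.e. false.
        return reduce(lambda a, b: ("or", a, b), disjuncts) if disjuncts else ("false",)
    raise ValueError(f"unknown formula tag: {tag}")

example = prop(("exists", "x", ("dia", ("atom", "Q", ("x",)))), {"a", "b"})
```

The size of the result grows with the size of the active domain for each quantifier, which is why the finiteness of $\mathrm{ADOM}(\Theta_{\mathcal{S}})$ is essential for the translation to produce a finite formula.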



Proof of Theorem 4.5. The theorem is proved by exhibiting,
for every n, a $\mu\mathcal{L}$ property that requires the existence of at
least n objects in the transition system.

Let $\mathcal{S} = \langle \mathcal{D}, \mathcal{P} \rangle$ be a DCDS with data layer $\mathcal{D} = \langle \mathcal{C}, \mathcal{R}, \emptyset, \mathcal{I}_0 \rangle$
and process layer $\mathcal{P} = \langle \mathcal{F}, \mathcal{A}, \varrho \rangle$, where $\mathcal{F} = \{f/1\}$,
$\mathcal{R} = \{R/1, Q/1\}$, $\mathcal{I}_0 = \{R(a)\}$, $\varrho = \{R(x) \mapsto \alpha(x)\}$ and
$\mathcal{A} = \{\alpha(p)\}$, where $\alpha(p) : \{\mathit{true} \rightsquigarrow \{Q(f(p))\}\}$. The concrete
transition system $\Upsilon_{\mathcal{S}}$ has the following shape:

• The initial state is $s_0 = \langle \{R(a)\}, \emptyset \rangle$;

• $s_0$ is connected to infinitely many successor states, each
one storing into Q a distinct value d resulting from the service
call f(a); each such state has then the form
$s_d = \langle \{Q(d)\}, \{f(a) \mapsto d\} \rangle$;

• each $s_d$ has no outgoing edge, because there is no applicable
action in $s_d$.

$\mathcal{S}$ is clearly run-bounded, in particular by a bound b = 3.

Let us now consider the following $\mu\mathcal{L}$ property without fixpoints:

$$\Phi_n = \exists x_1, \ldots, x_n . \bigwedge_{i \neq j} x_i \neq x_j \wedge \bigwedge_{i \in \{1, \ldots, n\}} \langle - \rangle Q(x_i)$$

The property states that there are n distinct values, each of which
is stored into relation Q in one of the successors of the initial state.
It is easy to see that $\Upsilon_{\mathcal{S}} \models \Phi_n$ for every n. On the other hand,
for every finite-state abstraction $\Theta_{\mathcal{S}}$ with k successors of the initial
state, we have that $\Theta_{\mathcal{S}} \not\models \Phi_{k+1}$. □



B.3 Weakly Acyclic DCDSs 






Proof of Theorem 4.6. The proof is by reduction from the
halting problem. We reuse without change the reduction in the
proof of Theorem 4.1. This reduction yields for any Turing Machine
TM a DCDS with deterministic services $\mathcal{S}$, such that $\mathcal{S}$ simulates
TM's computation. That is, the runs of TM correspond one-to-one
to the runs of $\Upsilon_{\mathcal{S}}$. It follows immediately that TM halts if and
only if $\mathcal{S}$ is run-bounded. □



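Proof of Lemma 4.1. Let $\mathcal{S} = \langle \mathcal{D}, \mathcal{P} \rangle$ be a DCDS with
data layer $\mathcal{D} = \langle \mathcal{C}, \mathcal{R}, \mathcal{E}, \mathcal{I}_0 \rangle$ and process layer $\mathcal{P} = \langle \mathcal{F}, \mathcal{A}, \varrho \rangle$.
Consider now $\Upsilon_{\mathcal{S}} = \langle \mathcal{C}, \mathcal{R}, \Sigma, s_0, db, \Rightarrow \rangle$ and
$\Upsilon_{\mathcal{S}^+} = \langle \mathcal{C}, \mathcal{R}, \Sigma^+, s_0, db, \Rightarrow^+ \rangle$. Since $\mathcal{S}^+$ is weakly acyclic by hypothesis,
to prove that run-boundedness of $\Upsilon_{\mathcal{S}^+}$ implies run-boundedness
of $\Upsilon_{\mathcal{S}}$, we show the following stronger result: for every run $\tau$ in
$\Upsilon_{\mathcal{S}}$, there exists a run $\tau^+$ in $\Upsilon_{\mathcal{S}^+}$ such that, for all pairs of states
$\tau(i) = \langle \mathcal{I}_i, \mathcal{M}_i \rangle$ and $\tau^+(i) = \langle \mathcal{I}_i^+, \mathcal{M}_i^+ \rangle$, we have

1. $\mathcal{M}_i^+$ extends $\mathcal{M}_i$;

2. $\mathcal{I}_i \subseteq \mathcal{I}_i^+$;

3. for the mappings mentioned in $\mathcal{M}_i^+$ but not in $\mathcal{M}_i$, $\mathcal{M}_i^+$
"agrees" with the maps contained in the suffix of $\tau[i]$, i.e.,

$$\mathcal{M}_i^+|_{C_i} = \Big( \bigcup_{j > i} \mathcal{M}_j \Big)\Big|_{C_i}$$

where $C_i = \mathrm{DOM}(\mathcal{M}_i^+) \cap \bigcup_{j > i} \mathrm{DOM}(\mathcal{M}_j)$.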

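We prove this by induction on the length of $\tau$:

(base case) The initial state of both runs is $\tau(0) = \tau^+(0) = \langle \mathcal{I}_0, \emptyset \rangle$,
and therefore all the three conditions are trivially satisfied.

(inductive step) Consider a pair of corresponding states $\tau(i)$ and
$\tau^+(i)$, with $i \geq 0$. By definition, $\tau(i) \Rightarrow \tau(i+1)$ means
that there exists an action $\alpha \in \mathcal{A}$ and a substitution $\sigma$ for
the parameters of $\alpha$ such that $\langle \tau(i), \alpha\sigma, \tau(i+1) \rangle \in \mathrm{EXEC}_{\mathcal{S}}$.
We first observe that $\alpha^+$ can be executed in $\tau^+(i)$, since $\mathcal{P}^+$
does not impose any restriction on the executability of actions.
Let $\mathit{Next}^+_i = \{ s^+ \in \Sigma^+ \mid \langle \tau^+(i), \alpha^+\sigma, s^+ \rangle \in \mathrm{EXEC}_{\mathcal{S}^+} \}$ be
the set of successor states of $\tau^+(i)$ that are obtained from the
application of $\alpha^+\sigma$.

We now show that there exists $s^+ \in \mathit{Next}^+_i$ that satisfies the
three claims above. The proof is then obtained by simply imposing
$\tau^+(i+1) = s^+$.

1. By definition, $\mathrm{DOM}(\mathcal{M}_{i+1}) = \mathrm{DOM}(\mathcal{M}_i) \cup \mathrm{CALLS}(\mathrm{DO}(\mathcal{I}_i, \alpha\sigma))$,
and, for every $s_k^+ = \langle \mathcal{I}_k^+, \mathcal{M}_k^+ \rangle \in \mathit{Next}^+_i$, we have
$\mathrm{DOM}(\mathcal{M}_k^+) = \mathrm{DOM}(\mathcal{M}_i^+) \cup \mathrm{CALLS}(\mathrm{DO}(\mathcal{I}_i^+, \alpha^+\sigma))$. Consider each
effect specification $q_j^+ \wedge Q_j^- \rightsquigarrow E_j \in \mathrm{EFFECT}(\alpha)$. By
definition of $q_j^+$ and $Q_j^-$, $\theta \in \mathit{ans}((q_j^+ \wedge Q_j^-)\sigma, \mathcal{I}_i)$
implies $\theta \in \mathit{ans}(q_j^+\sigma, \mathcal{I}_i)$, which in turn implies
$\theta \in \mathit{ans}(q_j^+\sigma, \mathcal{I}_i^+)$, because $\mathcal{I}_i \subseteq \mathcal{I}_i^+$
by induction hypothesis. Consequently, we
have $\mathrm{DO}(\mathcal{I}_i, \alpha\sigma) \subseteq \mathrm{DO}(\mathcal{I}_i^+, \alpha^+\sigma)$, and hence
$\mathrm{CALLS}(\mathrm{DO}(\mathcal{I}_i, \alpha\sigma)) \subseteq \mathrm{CALLS}(\mathrm{DO}(\mathcal{I}_i^+, \alpha^+\sigma))$. Since
$\mathrm{DOM}(\mathcal{M}_i) \subseteq \mathrm{DOM}(\mathcal{M}_i^+)$ by induction hypothesis, then
we obtain $\mathrm{DOM}(\mathcal{M}_{i+1}) \subseteq \mathrm{DOM}(\mathcal{M}_k^+)$. Since $\mathcal{S}^+$ has
no equality constraint, the states in $\mathit{Next}^+_i$ cover every
possible result obtained by calling the service calls in
$\mathcal{M}_k^+ \setminus \mathcal{M}_i^+$, including those states for which $\mathcal{M}_k^+$ is
an extension of $\mathcal{M}_{i+1}$. We use $\mathit{Next}^+_{\mathcal{M}_{i+1}}$ to denote such
states.

2. By definition, for each state $s_k^+ = \langle \mathcal{I}_k^+, \mathcal{M}_k^+ \rangle \in \mathit{Next}^+_{\mathcal{M}_{i+1}}$,
we have that $\mathcal{M}_k^+$ extends $\mathcal{M}_{i+1}$. Therefore,
since $\mathrm{DO}(\mathcal{I}_i, \alpha\sigma) \subseteq \mathrm{DO}(\mathcal{I}_i^+, \alpha^+\sigma)$, we have
$\mathcal{I}_{i+1} = \mathcal{M}_{i+1}(\mathrm{DO}(\mathcal{I}_i, \alpha\sigma)) \subseteq \mathcal{I}_k^+ = \mathcal{M}_k^+(\mathrm{DO}(\mathcal{I}_i^+, \alpha^+\sigma))$.

3. Since $\mathcal{S}^+$ has no equality constraints, we observe that the
states in $\mathit{Next}^+_{\mathcal{M}_{i+1}}$ cover all possible values for the service
calls that are not mentioned in $\mathcal{M}_{i+1}$. Therefore, there
must exist at least one state $s^+ \in \mathit{Next}^+_{\mathcal{M}_{i+1}}$ that satisfies the
third claim. In other words, by imposing $\tau^+(i+1) = s^+$,
we have

$$\mathcal{M}_{i+1}^+|_{C_{i+1}} = \Big( \bigcup_{j > i+1} \mathcal{M}_j \Big)\Big|_{C_{i+1}}$$ □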

Proof of Theorem 4.7. Let $\mathcal{S} = \langle \mathcal{D}, \mathcal{P} \rangle$ be a DCDS with
data layer $\mathcal{D} = \langle \mathcal{C}, \mathcal{R}, \mathcal{E}, \mathcal{I}_0 \rangle$ and process layer $\mathcal{P} = \langle \mathcal{F}, \mathcal{A}, \varrho \rangle$.
We consider the positive approximate $\mathcal{S}^+$, showing that if the
dependency graph $G = \langle N, E \rangle$ of $\mathcal{S}$ (which corresponds by definition
to the one of $\mathcal{S}^+$) is weakly acyclic, then $\mathcal{S}^+$ is run-bounded.
The complete proof is then directly obtained by appealing to Lemma
4.1, which states that if $\mathcal{S}^+$ is run-bounded, then $\mathcal{S}$ is run-bounded
as well.

To prove that weak acyclicity of $\mathcal{S}$ implies that $\mathcal{S}^+$ is run-bounded,
we exploit the connection with the chase of a set of tuple
generating dependencies (TGDs) in data exchange. In particular,
we resort to the proof given in [23], Theorem 3.9. For every node
$p \in N$, we consider an incoming path to be any (finite or infinite)
path ending in p. For simplicity, we say that a value appears in position
$p = \langle R_k, j \rangle \in N$ if it appears in the j-th component of an
$R_k$ tuple. We define the rank of p, denoted rank(p), as the maximum
number of special edges on any such incoming path. Since
$\mathcal{S}^+$ is weakly acyclic by hypothesis, G does not contain cycles going
through special edges, and therefore rank(p) is finite. Let r
be the maximum among rank($p_i$) over all nodes. We observe that
$r \leq |N|$; indeed no path can lead to the same node twice using
special edges, otherwise G would contain a cycle going through
special edges, thus breaking the weak acyclicity hypothesis. Notice
also that $|N|$ is a constant value, because it is obtained from $\mathcal{R}$,
which is fixed. We now partition the nodes in N according to their
rank, obtaining a set of sets $\{N_0, N_1, \ldots, N_r\}$, where $N_i$ is the set
of all nodes with rank i. The proof is then a natural consequence of
the following claim:

Claim. Consider a trace $\tau$ in $\Upsilon_{\mathcal{S}^+}$. For every $i \in \{0, \ldots, r\}$,
the total number of distinct values occurring
in the databases of $\tau$ inside position $p \in N_i$ is bounded
by a polynomial $P_i(|\mathrm{ADOM}(\mathcal{I}_0)|)$.

We prove the claim by induction on i:

(Base case) Consider $p \in N_0$. By definition, p has no incoming
path containing special edges. Therefore, no new values are
stored in p along the run: p can just store values that are part
of the initial database $\mathcal{I}_0$. This holds for all nodes in $N_0$, and
hence we can fix $P_0(|\mathrm{ADOM}(\mathcal{I}_0)|) = |\mathrm{ADOM}(\mathcal{I}_0)|$.

(Inductive step) Consider $p \in N_i$, with $i \in \{1, \ldots, r\}$. The first
kind of values that may be stored inside p are those values that
were stored inside the component itself in $\mathcal{I}_0$. The number of
such values is at most $|\mathrm{ADOM}(\mathcal{I}_0)|$. In addition, a value may
be stored in p for two reasons: either it is copied from some
other position $p' \in N_j$ with $i \neq j$, or it is generated by means
of a service call.

We first determine how many fresh values can be generated
by service calls. The possibility of generating and storing
a new value in p as a result of an action is reflected by
the presence of special edges. By definition, any special
edge entering p must start from a node $p' \in N_0 \cup \ldots \cup N_{i-1}$.
By induction hypothesis, the number of distinct values
that can exist in $p'$ is bounded by
$H(|\mathrm{ADOM}(\mathcal{I}_0)|) = \sum_{j \in \{0, \ldots, i-1\}} P_j(|\mathrm{ADOM}(\mathcal{I}_0)|)$. Let $b_a$ be the maximum
number of special edges that enter a position, over all positions
in the schema; $b_a$ bounds the arity taken by service calls
in $\mathcal{F}$. Then for every choice of $b_a$ values in $N_0 \cup \ldots \cup N_{i-1}$
(one for each special edge that can enter a position) and for
every action in $\mathcal{A}^+$, the number of new values generated at
position p is bounded by $t_f \cdot H(|\mathrm{ADOM}(\mathcal{I}_0)|)^{b_a}$, where $t_f$ is the
total number of facts mentioned in the effects of actions that
belong to $\mathcal{A}^+$. Notice that this number does not depend on
the data in $\mathcal{I}_0$. By considering all positions in $N_i$, the total
number of values that can be generated is then bounded by
$G(|\mathrm{ADOM}(\mathcal{I}_0)|) = |N_i| \cdot t_f \cdot H(|\mathrm{ADOM}(\mathcal{I}_0)|)^{b_a}$. Obviously,
$G(\cdot)$ is a polynomial, because $t_f$ and $b_a$ are values extracted
from the schema $\mathcal{R}$ of the DCDS, which is fixed.

We count next the number of distinct values that can be copied
to positions of $N_i$ from positions of $N_j$, with $j \neq i$. A copy is
represented in the graph as a normal edge going from a node
in $N_j$ to a node in $N_i$, with $j \neq i$. We observe first that
such normal edges can start only from nodes in $N_0 \cup \ldots \cup N_{i-1}$,
that is, they cannot start from nodes in $N_j$ with $j > i$.
We prove this by contradiction. Assume that there exists
$\langle p', p, \mathit{false} \rangle \in E$, such that $p \in N_i$ and $p' \in N_j$ with $j > i$.
In this case, the rank of p would be $j > i$, which contradicts
the fact that $p \in N_i$. As a consequence, the number of distinct
values that can be copied to positions in $N_i$ is bounded by the
total number of values in $N_0 \cup \ldots \cup N_{i-1}$, which corresponds
to $H(|\mathrm{ADOM}(\mathcal{I}_0)|)$ from our previous consideration. Putting
it all together, we define
$P_i(|\mathrm{ADOM}(\mathcal{I}_0)|) = |\mathrm{ADOM}(\mathcal{I}_0)| + G(|\mathrm{ADOM}(\mathcal{I}_0)|) + H(|\mathrm{ADOM}(\mathcal{I}_0)|)$.
$P_i(\cdot)$ is a polynomial, and therefore the claim is proven.

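In the above claim, i is bounded by the maximum rank r, which is
a constant. Hence, there exists a fixed polynomial $P(\cdot)$ such that
the number of distinct values that can exist in the active domains
of the run $\tau$ is bounded by $P(|\mathrm{ADOM}(\mathcal{I}_0)|)$. Technically, given
$\Upsilon_{\mathcal{S}^+} = \langle \mathcal{C}, \mathcal{R}, \Sigma, s_0, db, \Rightarrow \rangle$, we have:

$$\Big| \bigcup_{s \text{ state of } \tau} \mathrm{ADOM}(db(s)) \Big| \leq P(|\mathrm{ADOM}(\mathcal{I}_0)|)$$

which attests that $\tau$ is (data) bounded, and consequently that $\mathcal{S}^+$ is
run-bounded. □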
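
The rank used in this proof can be computed mechanically. The sketch below is an illustration under assumed encodings, not the authors' procedure: the dependency graph is given as a list of triples (source, target, special), and the hypothetical function `ranks` performs a longest-path relaxation in which special edges weigh 1 and normal edges weigh 0. If the relaxation does not stabilize, some cycle goes through a special edge, so the graph is not weakly acyclic.

```python
def ranks(nodes, edges):
    """Maximum number of special edges on any path ending in each node.

    edges: iterable of (src, dst, special) with special a bool.
    Raises ValueError if a cycle goes through a special edge.
    """
    rank = {p: 0 for p in nodes}
    # Without a cycle through a special edge, the values stabilize
    # within len(nodes) rounds of relaxation.
    for _ in range(len(nodes) + 1):
        changed = False
        for src, dst, special in edges:
            candidate = rank[src] + (1 if special else 0)
            if candidate > rank[dst]:
                rank[dst] = candidate
                changed = True
        if not changed:
            return rank
    raise ValueError("cycle through a special edge: not weakly acyclic")


def partition_by_rank(rank):
    """Group nodes into the sets N_0, N_1, ..., N_r of the proof."""
    r = max(rank.values(), default=0)
    return [sorted(p for p, k in rank.items() if k == i) for i in range(r + 1)]


# Hypothetical positions: a value in R.1 feeds a service call whose
# result is stored in Q.1; Q.1 and S.1 copy values to each other.
nodes = ["R.1", "Q.1", "S.1"]
edges = [("R.1", "Q.1", True), ("Q.1", "S.1", False), ("S.1", "Q.1", False)]
rank = ranks(nodes, edges)
layers = partition_by_rank(rank)
```

Note that the cycle between Q.1 and S.1 is harmless because it uses only normal edges, which matches the definition of weak acyclicity.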



C. NONDETERMINISTIC SERVICES 
C.2 State-bounded Systems 

Proof of Theorem 5.1. We reuse the proof of Theorem 4.1.
Recall that the reduction in this proof constructs for every Turing
Machine TM a DCDS with deterministic services $\mathcal{S}$ that simulates
the computation of TM. It also constructs a propositional safety
property $\Phi$ such that $\Upsilon_{\mathcal{S}} \models \Phi$ if and only if TM does not halt.

What we need here is a reduction to a DCDS with nondeterministic
services. However, we recall from the proof of Theorem 4.1
that the only service in the process layer, service newCell, is
guaranteed to be called only with distinct arguments across distinct
transitions, and so its behavior is unaffected by the choice of
deterministic versus nondeterministic semantics. Therefore, the reduction
applies unchanged to DCDSs with nondeterministic services. □



Proof of Theorem 5.2. We prove a stronger result, namely
for linear-time $\mu\mathcal{L}_A$ sentences. Such sentences can be written using
LTL syntax.

We reduce from the problem of satisfiability of LTL with freeze
quantifier over infinite data words, known to be highly undecidable
($\Sigma^1_1$-hard) [20].

Infinite data words [20]. Let $\Sigma$ be a finite alphabet of labels and
D an infinite set of data values. An infinite data word $w = \{w_i\}$
is an infinite sequence over $\Sigma \times D$, i.e., each $w_i$ is of the form
$(a_i, d_i)$ with $a_i \in \Sigma$ and $d_i \in D$.

LTL with freeze quantifier (LTL$^{\downarrow}$). This logic operates over
infinite data words, seen as runs. It extends propositional LTL with
a finite number of registers, which can record the data value at the
current step of the run (position in the data word), and recall it at
subsequent steps. The operation of recording the data value at the
current position into register i is denoted with $\downarrow_i$. $\uparrow_i$ denotes the
boolean comparison of the data value at the current position with
the value stored in register i.

As an example, consider the LTL$^{\downarrow}$ sentence

$$\varphi_{ex} := \downarrow_1 X(G(a \rightarrow \neg\uparrow_1))$$

over alphabet {a, b}, which states that the data value assigned
to each label a at positions greater than one is different from
the data value at the first position of the data word. Notice
that the data value at the first position is recorded in register 1
by operation $\downarrow_1$, and it is compared to subsequent data values by $\uparrow_1$.
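
To make the register semantics concrete, the following Python sketch evaluates the example sentence on a finite prefix of a data word. It is only an approximation made for illustration: the logic is defined over infinite words, the function name `phi_ex_on_prefix` and the list-of-pairs encoding are assumptions of this sketch, and a prefix can refute the sentence but can never establish it for an infinite word.

```python
def phi_ex_on_prefix(prefix):
    """Check the example sentence on a finite prefix of a data word.

    prefix: list of (label, datum) pairs.
    Returns False as soon as some later position labelled "a" carries
    the datum frozen at the first position; True if no violation occurs.
    """
    if not prefix:
        return True
    _, register_1 = prefix[0]              # freeze: store the first datum
    return all(
        not (label == "a" and datum == register_1)   # a -> not(up_1)
        for label, datum in prefix[1:]               # X G: positions after the first
    )


no_violation = phi_ex_on_prefix([("b", 7), ("a", 3), ("b", 7)])
violation = phi_ex_on_prefix([("a", 7), ("b", 1), ("a", 7)])
```

In the first word the datum 7 reappears only under label b, so the sentence is not violated; in the second word it reappears under label a, which refutes the sentence.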

The DCDS construction. Given a finite alphabet
$\Sigma = \{\sigma_i\}_{i \in \{1, \ldots, n\}}$, we build a DCDS $\mathcal{S} = \langle \mathcal{D}_\Sigma, \mathcal{P}_\Sigma \rangle$ with
nondeterministic services, such that each run of $\Upsilon_{\mathcal{S}}$ represents an infinite
data word over $\Sigma$. In particular, each state in the run holds the label
and data value for a single position in the data word. Moreover,
given an LTL$^{\downarrow}$ sentence $\varphi$ over $\Sigma$, we construct a $\mu\mathcal{L}_A$ formula $\Phi$,
such that $\Upsilon_{\mathcal{S}} \models \Phi$ if and only if $\varphi$ is unsatisfiable.

The idea is to model the registers with existentially quantified
variables, which $\mu\mathcal{L}_A$ allows us to introduce at any given point in
the run and use subsequently, even if in between their binding does
not persist in the run.

More precisely, we define the data layer $\mathcal{D}_\Sigma$ of $\mathcal{S}$ as
$\mathcal{D}_\Sigma = \langle \mathcal{C}, \mathcal{R}, \emptyset, \mathcal{I}_0 \rangle$, where $\mathcal{C} = \Sigma \cup \{0\}$, $\mathcal{R}$ = {LABEL/1, DATUM/1},
and $\mathcal{I}_0 = \emptyset$. Intuitively, LABEL stores the label and DATUM
the data value. We then define the process layer $\mathcal{P}_\Sigma$ of $\mathcal{S}$ as
$\mathcal{P}_\Sigma = \langle \mathcal{F}, \mathcal{A}_\Sigma, \varrho_\Sigma \rangle$, where:

• $\mathcal{F} = \{f/0\}$.

• For each $i \in \{1, \ldots, n\}$, $\varrho_\Sigma$ contains an action $\alpha_i$ with no
parameters and no guard ($\mathit{true} \mapsto \alpha_i$).

• Each $\alpha_i \in \mathcal{A}_\Sigma$ contains a single effect $e_i$, which creates the
position of a data word corresponding to label $\sigma_i \in \Sigma$:

$$e_i : \mathit{true} \rightsquigarrow \mathrm{LABEL}(\sigma_i) \wedge \mathrm{DATUM}(f())$$

The service call f() is used to get an arbitrary data value from the
domain during the action execution. It is nondeterministic, and will
therefore return possibly distinct values across the run.

Since actions are always executable, at each step of a run all of
them qualify, and one is nondeterministically chosen. In this way,
the collection of all runs corresponds to all possible infinite data
words. Observe that $\mathcal{S}$ is state-bounded, as each state contains just
one label and one data value.
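
The shape of the runs produced by this construction can be simulated with a short Python sketch. Everything here is an assumption made for illustration: the function name `run_prefix`, the dictionary encoding of states, the use of a random generator to resolve the nondeterministic choice of action, and the caller-supplied `service` callable that stands in for f(). Only a finite prefix of a run is produced.

```python
import itertools
import random


def run_prefix(alphabet, service, steps, rng):
    """Generate the first `steps` states of one run.

    Each step picks one action (i.e. one label) nondeterministically and
    calls the nullary service to obtain the datum for that position.
    Each state is rebuilt from scratch, so it holds exactly one LABEL
    fact and one DATUM fact.
    """
    states = []
    for _ in range(steps):
        label = rng.choice(alphabet)
        states.append({"LABEL": {label}, "DATUM": {service()}})
    return states


# Example: a service that happens to return a fresh value at each call.
counter = itertools.count()
states = run_prefix(["a", "b"], lambda: next(counter), 5, random.Random(0))
```

Because every state is overwritten at each step, the size of each state stays constant even though the number of distinct data values along the run is unbounded, which is the state-boundedness observation made above.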



The property. We now define the property. For simplicity of
presentation, we show it using an LTL-based syntax (branching is
irrelevant here), though it is clearly expressible in $\mu\mathcal{L}_A$.
We obtain $\varphi'$ from $\varphi$ by:

1. replacing each freeze quantifier $\downarrow_n$ with $\exists x_n.\mathrm{DATUM}(x_n)$,
and

2. replacing each occurrence of $\uparrow_n$ with $\mathrm{DATUM}(x_n)$, and

3. replacing each proposition $\sigma \in \Sigma$ with $\mathrm{LABEL}(\sigma)$.

Now let $\Phi := \neg\varphi'$.

We illustrate the rewrite on property $\varphi_{ex}$ above, obtaining

$$\varphi'_{ex} = \exists x_1.\mathrm{DATUM}(x_1) \wedge X\, G(\mathrm{LABEL}(a) \rightarrow \neg\mathrm{DATUM}(x_1)).$$

It is easy to see that $\varphi$ is unsatisfiable over infinite data words
using alphabet $\Sigma$ if and only if $\Upsilon_{\mathcal{S}} \models \Phi$.

As a result, $\mu\mathcal{L}_A$ verification of state-bounded DCDSs with
nondeterministic services is undecidable. □

Proof of Theorem 5.3. See Section C.3. □

C.3 Abstract Transition System 



We formalize the discussion from Section 5.3. Since DCDSs
with nondeterministic services are modeled by means of transition
systems whose states are constituted by database instances, with a
slight abuse of notation we will directly use the state to refer to its
database instance.

Equality commitments. Consider a set D comprised of constants
and of Skolem terms built by applying a Skolem function to
constant arguments. An equality commitment $\mathcal{H}$ on D is a partition
of D, i.e. a set of disjoint subsets of D, called cells, such that the
union of the cells in $\mathcal{H}$ is D. Moreover, each cell contains at most
one constant (but arbitrarily many Skolem terms). For any $e \in D$,
$[e]_{\mathcal{H}}$ denotes the cell e belongs to. The intention of the partition is
to model equality and non-equality commitments on the members
of D as follows: for every $e_1, e_2 \in D$, $e_1 = e_2$ if and only if
$[e_1]_{\mathcal{H}} = [e_2]_{\mathcal{H}}$.

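Service call evaluations that respect equality commitments. It
is convenient to view the concrete transition system $\Upsilon_{\mathcal{S}}$ in the
following equivalent formulation, which emphasizes equality commitments
on the service calls: successor states are built by picking an
equality commitment $\mathcal{H}$, and then picking a service call evaluation
that respects $\mathcal{H}$. More specifically,

• for each state $\mathcal{I}$,

• for each action $\alpha$,

• for each parameter choice $\sigma$, and

• for each equality commitment $\mathcal{H}$ involving the service
calls in $\mathrm{CALLS}(\mathrm{DO}(\mathcal{I}, \alpha, \sigma))$ and the values in
$\mathrm{ADOM}(\mathcal{I}) \cup \mathrm{ADOM}(\mathcal{I}_0)$,

$\Upsilon_{\mathcal{S}}$ contains possibly infinitely many successor states $\mathcal{I}_{next}$, each
obtained from $\mathrm{DO}(\mathcal{I}, \alpha, \sigma)$ by picking a service call evaluation that
respects $\mathcal{H}$. We say that evaluation $\theta$ respects $\mathcal{H}$ if for every two
terms $t_1, t_2 \in \mathrm{CALLS}(\mathrm{DO}(\mathcal{I}, \alpha, \sigma)) \cup \mathrm{ADOM}(\mathcal{I}) \cup \mathrm{ADOM}(\mathcal{I}_0)$, we
have $[t_1]_{\mathcal{H}} = [t_2]_{\mathcal{H}}$ if and only if $t_1\theta = t_2\theta$.

Given $\mathcal{I}$, $\alpha$, $\sigma$ and $\mathcal{H}$, we denote the set of all legal evaluations
with

$$\mathrm{EVALS}^{\mathcal{H}}_{\mathcal{C}}(\mathcal{I}, \alpha, \sigma) := \{\theta \mid \theta \in \mathrm{EVALS}_{\mathcal{C}}(\mathcal{I}, \alpha, \sigma),\ \theta \text{ respects } \mathcal{H},\ \mathrm{DO}(\mathcal{I}, \alpha, \sigma)\theta \models \mathcal{E}\}.$$

Notice that we consider legal only those evaluations that respect
the equality commitment $\mathcal{H}$ and that, conforming to the semantics
of the concrete transition system, generate successors which satisfy
the constraints $\mathcal{E}$. Finally, notice that $\mathcal{H}$ determines an isomorphism
type, as all successors of $\mathcal{I}$ generated by the evaluations in
$\mathrm{EVALS}^{\mathcal{H}}_{\mathcal{C}}(\mathcal{I}, \alpha, \sigma)$ are isomorphic to each other.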
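
The notion of an evaluation respecting an equality commitment can be illustrated with a small Python sketch. The encoding is an assumption made for this example only: terms are strings, an equality commitment is a list of disjoint sets of terms, an evaluation is a dictionary from service-call terms to values, and constants evaluate to themselves. The helper names `cell_of` and `respects` are hypothetical.

```python
def cell_of(commitment, term):
    """Index of the cell of the partition that contains the term."""
    for index, cell in enumerate(commitment):
        if term in cell:
            return index
    raise KeyError(term)


def respects(theta, commitment):
    """True iff two terms share a cell exactly when theta maps them to
    the same value (constants are mapped to themselves)."""
    terms = [t for cell in commitment for t in cell]

    def value(t):
        return theta.get(t, t)

    return all(
        (cell_of(commitment, s) == cell_of(commitment, t)) == (value(s) == value(t))
        for s in terms
        for t in terms
    )


# Commitment: f(a) equals the constant a; g(b) differs from a, b and f(a).
commitment = [{"a", "f(a)"}, {"b"}, {"g(b)"}]

ok = respects({"f(a)": "a", "g(b)": "c"}, commitment)
also_ok = respects({"f(a)": "a", "g(b)": "d"}, commitment)
bad = respects({"f(a)": "a", "g(b)": "b"}, commitment)
```

The two accepted evaluations differ only in the fresh value chosen for g(b), so they produce isomorphic successors; this is the source of infinite branching that prunings remove by keeping finitely many representatives per commitment.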

Prunings. We observe that for each state $\mathcal{I}$ of the concrete transition
system $\Upsilon_{\mathcal{S}}$, the number of possible choices of $\alpha$, $\sigma$ and $\mathcal{H}$ is
finite. The sole reason for infinite branching in $\Upsilon_{\mathcal{S}}$ are the infinitely
many distinct evaluations that respect $\mathcal{H}$, whenever $\mathcal{H}$ states that at
least one service call result is distinct from $\mathrm{ADOM}(\mathcal{I}) \cup \mathrm{ADOM}(\mathcal{I}_0)$:
in that case, the service call can be substituted with any value in
$\mathcal{C} \setminus (\mathrm{ADOM}(\mathcal{I}) \cup \mathrm{ADOM}(\mathcal{I}_0))$.

In contrast, we obtain a finitely-branching transition system if
instead of keeping the successors generated by all evaluations in
$\mathrm{EVALS}^{\mathcal{H}}_{\mathcal{C}}(\mathcal{I}, \alpha, \sigma)$, we keep the successors generated by a finite
subset of these evaluations (if $\mathrm{EVALS}^{\mathcal{H}}_{\mathcal{C}}(\mathcal{I}, \alpha, \sigma)$ is non-empty, we
pick a non-empty subset, to ensure that if $\mathcal{H}$ is represented among
the successors of $\mathcal{I}$ in $\Upsilon_{\mathcal{S}}$, it is also represented among the successors
of $\mathcal{I}$ in $\Theta_{\mathcal{S}}$). We call any transition system obtained in this way
a pruning of $\Upsilon_{\mathcal{S}}$, and we denote with $\mathrm{PRUNINGS}(\Upsilon_{\mathcal{S}})$ the set of all
such prunings. By construction, every pruning of $\Upsilon_{\mathcal{S}}$ is finitely
branching.

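Formally, let $\mathcal{S}$ be a DCDS and $\Upsilon_{\mathcal{S}}$ its concrete transition system,
with states $\Sigma_c$ and initial state $\mathcal{I}_0$. A pruning of $\Upsilon_{\mathcal{S}}$ is the restriction
of $\Upsilon_{\mathcal{S}}$ to a subset of states $\Sigma_p \subseteq \Sigma_c$, where $\Sigma_p$ satisfies the
following properties:

(i) $\mathcal{I}_0 \in \Sigma_p$, and

(ii) for each $\mathcal{I} \in \Sigma_c$ and each equality commitment $\mathcal{H}$, if $\mathcal{H}$
is represented by some successor of $\mathcal{I}$ in $\Upsilon_{\mathcal{S}}$, it is also represented
by a successor of $\mathcal{I}$ in $\Theta_{\mathcal{S}}$. We say that $\mathcal{H}$ is represented
by successor $\mathcal{I}'$ of $\mathcal{I}$ if there exist $\alpha$, $\sigma$ and
$\theta \in \mathrm{EVALS}^{\mathcal{H}}_{\mathcal{C}}(\mathcal{I}, \alpha, \sigma)$ such that $\langle \mathcal{I}, \alpha\sigma\theta, \mathcal{I}' \rangle \in \text{N-EXEC}_{\mathcal{S}}$.

(iii) for each $\mathcal{I} \in \Sigma_c$, the number of successors of $\mathcal{I}$ that are also
in $\Sigma_p$ is finite.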

Clearly, a concrete transition system $\Upsilon_{\mathcal{S}}$ admits
(potentially infinitely) many prunings, but we show next that they all are
persistence-preserving bisimilar to $\Upsilon_{\mathcal{S}}$ (and therefore to
each other, due to transitivity of the $\sim$ relation):

Lemma C.1. For every concrete transition system $\Upsilon_{\mathcal{S}}$ and
pruning $\Theta_{\mathcal{S}} \in \mathrm{PRUNINGS}(\Upsilon_{\mathcal{S}})$, we
have that $\Theta_{\mathcal{S}} \sim \Upsilon_{\mathcal{S}}$.

The result follows from the fact that state isomorphism implies
persistence-preserving bisimilarity. In the following, we denote
with $s \mapsto_h s'$ the fact that $h$ is an isomorphism from state $s$ to $s'$.

Lemma C.2. Consider a concrete transition system $\Upsilon_{\mathcal{S}}$ with
initial state $s_0$ and one of its prunings $\Theta_{\mathcal{S}}$. Let $s_c$ be a
state of $\Upsilon_{\mathcal{S}}$ and $s_p$ a state of $\Theta_{\mathcal{S}}$. If
there exists a function $h$ such that $h$ fixes $\mathrm{ADOM}(s_0)$ and
$s_p \mapsto_h s_c$, then $s_p \sim_h s_c$.



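Proof of Lemma C.2. Let
$\Upsilon_{\mathcal{S}} = \langle \mathcal{C}, \mathcal{R}, \Sigma_c, s_0, db, \Rightarrow_c \rangle$
and
$\Theta_{\mathcal{S}} = \langle \mathcal{C}, \mathcal{R}, \Sigma_p, s_0, db, \Rightarrow_p \rangle$.
The proof follows from the following claim:

Claim 1. Given $s_c \in \Sigma_c$ and $s_p \in \Sigma_p$, if $s_p \mapsto_h s_c$
and $h$ is the identity on $\mathrm{ADOM}(\mathcal{I}_0)$, then for each $s'_c$
such that $s_c \Rightarrow_c s'_c$ there exist $s'_p$ and $h'$ such that

(i) $s_p \Rightarrow_p s'_p$;

(ii) $h'$ is an extension of $h|_{\mathrm{ADOM}(s_p) \cap \mathrm{ADOM}(s'_p)}$;

(iii) $h'$ is the identity on $\mathrm{ADOM}(\mathcal{I}_0)$;

(iv) $s'_p \mapsto_{h'} s'_c$.

Indeed, this claim allows us to exhibit the bisimilarity relation

$$R = \{\langle x, i, y \rangle \mid x \in \Sigma_p,\ y \in \Sigma_c,\ x \mapsto_i y\}.$$

$R$ is a bisimilarity relation because it satisfies the forth condition
in the definition of persistence-preserving bisimilarity by Claim 1.
It trivially satisfies the back condition because the pruning is constructed
by picking a subset of the states of $\Upsilon_{\mathcal{S}}$. Since by
construction $\langle s_p, h, s_c \rangle \in R$, we have $s_p \sim_h s_c$.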

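To prove Claim 1, we observe that the successor $s'_c$ of $s_c$ is generated
by a particular choice of the action $\alpha$ (with condition-action rule
$Q \mapsto \alpha$), the parameter instantiation $\sigma_c$ (such that
$s_c \models Q\sigma_c$), the equality commitment $\mathcal{H}_c$ on
$\mathrm{CALLS}(\mathrm{DO}(s_c, \alpha, \sigma_c)) \cup \mathrm{ADOM}(s_c) \cup \mathrm{ADOM}(\mathcal{I}_0)$,
and the service call evaluation
$\theta_c \in \mathrm{EVALS}_{\mathcal{H}_c}(s_c, \alpha, \sigma_c)$:
$s'_c = \mathrm{DO}(s_c, \alpha, \sigma_c)\theta_c$. We show how to construct
$\sigma_p$, $\mathcal{H}_p$ and
$\theta_p \in \mathrm{EVALS}_{\mathcal{H}_p}(s_p, \alpha, \sigma_p)$ such that
$s_p \models Q\sigma_p$ and $s'_p = \mathrm{DO}(s_p, \alpha, \sigma_p)\theta_p$
satisfies the claim.

We let $\sigma_p = h^{-1}(\sigma_c)$, observing that since $Q$ is a first-order
query, it is preserved under isomorphism, so $s_c \models Q\sigma_c$ implies
$s_p \models Q\sigma_p$. Thus, $\sigma_p$ is a legal parameter instantiation.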

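To construct $\mathcal{H}_p$, $\theta_p$, we first show that
$\hat{s}_c = \mathrm{DO}(s_c, \alpha, \sigma_c)$ and
$\hat{s}_p = \mathrm{DO}(s_p, \alpha, \sigma_p)$ are isomorphic, as witnessed by
the function $\hat{h} : \mathrm{ADOM}(\hat{s}_p) \rightarrow \mathrm{ADOM}(\hat{s}_c)$
defined as follows:

$$\hat{h} := \{c \mapsto h(c) \mid c \in \mathrm{ADOM}(s_p) \cup \mathrm{ADOM}(\mathcal{I}_0)\}
\;\cup\; \{f(m_1, \ldots, m_n) \mapsto f(h(m_1), \ldots, h(m_n)) \mid f(m_1, \ldots, m_n) \in \mathrm{CALLS}(\hat{s}_p)\}.$$

From the definition of $\hat{h}$ and the fact that the service calls are
generated by queries preserved under isomorphism, it follows immediately
that $\hat{s}_p \mapsto_{\hat{h}} \hat{s}_c$. It is easy to see that $\hat{h}$ is
also an isomorphism between
$\mathrm{CALLS}(\mathrm{DO}(s_c, \alpha, \sigma_c)) \cup \mathrm{ADOM}(s_c) \cup \mathrm{ADOM}(\mathcal{I}_0)$
and
$\mathrm{CALLS}(\mathrm{DO}(s_p, \alpha, \sigma_p)) \cup \mathrm{ADOM}(s_p) \cup \mathrm{ADOM}(\mathcal{I}_0)$
(i.e. $\hat{h}$ preserves the structure of Skolem terms), and therefore between
the sets of corresponding equality commitments.

We therefore pick $\mathcal{H}_p = \hat{h}^{-1}(\mathcal{H}_c)$. By construction
of the pruning, each equality type is represented among a state's successors in
the pruning, i.e. there exists $\theta_p$ that respects $\mathcal{H}_p$, and there
exists $s'_p \in \Sigma_p$ such that $s'_p = \hat{s}_p\theta_p$.

The existence of legal choices for $\alpha$, $\sigma_p$, $\mathcal{H}_p$ and
$\theta_p$ proves item (i) of Claim 1, namely that $s_p \Rightarrow_p s'_p$.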

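To prove the remaining items, we exhibit $h'$ defined as follows:

$$h'(t) = \hat{h}(\hat{t})\theta_c,$$

for some choice of $\hat{t}$ such that $\hat{t}\theta_p = t$.

To see why $h'$ is well-defined, observe that, by construction of
the successor states in $\Upsilon_{\mathcal{S}}$, for each
$t \in \mathrm{ADOM}(s'_p)$ there must exist $\hat{t} \in \mathrm{ADOM}(\hat{s}_p)$
such that $\theta_p$ evaluates $\hat{t}$ to $t$ ($\hat{t}\theta_p = t$). Moreover,
observe that if there are distinct $\hat{t}, \hat{u} \in \mathrm{ADOM}(\hat{s}_p)$
such that $t = \hat{t}\theta_p = \hat{u}\theta_p$, it does not matter which one we
pick in the definition of $h'$, since
$\hat{h}(\hat{t})\theta_c = \hat{h}(\hat{u})\theta_c$. This is because $\theta_p$
respects $\mathcal{H}_p$, and therefore
$[\hat{t}]_{\mathcal{H}_p} = [\hat{u}]_{\mathcal{H}_p}$. Since
$\mathcal{H}_p \mapsto_{\hat{h}} \mathcal{H}_c$, it follows that
$[\hat{h}(\hat{t})]_{\mathcal{H}_c} = [\hat{h}(\hat{u})]_{\mathcal{H}_c}$, and since
$\theta_c$ respects $\mathcal{H}_c$ we have that
$\hat{h}(\hat{t})\theta_c = \hat{h}(\hat{u})\theta_c$.

Items (ii), (iii) and (iv) of Claim 1 follow by similar reasoning
from the fact that service call evaluations respect the equality
commitments, which are isomorphic. □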

Proof of Lemma C.1. This is a corollary of Lemma C.2.
Indeed, by definition, $\Theta_{\mathcal{S}} \sim \Upsilon_{\mathcal{S}}$ holds if
and only if the initial state $s_0^{\Theta}$ of $\Theta_{\mathcal{S}}$ is bisimilar
to the initial state $s_0^{\Upsilon}$ of $\Upsilon_{\mathcal{S}}$, i.e. there
exists an isomorphism $h$ such that $s_0^{\Theta} \sim_h s_0^{\Upsilon}$.

By definition, a concrete transition system shares the initial state
with all its prunings, so $s_0^{\Theta} = s_0^{\Upsilon}$. The identity mapping
$id$ witnesses isomorphism: $s_0^{\Theta} \mapsto_{id} s_0^{\Upsilon}$. By
Lemma C.2 we have $s_0^{\Theta} \sim_{id} s_0^{\Upsilon}$. □

Eventually Recycling Prunings. While all prunings of a concrete
transition system are finitely-branching, they are not guaranteed to
be finite. The reason is that they do not necessarily rule out infinitely
long simple runs $\tau$, along which the service calls return in each state
$\mathcal{I}$ "fresh" values, i.e. values distinct from all values appearing in
$\mathcal{I}$ and its predecessors on $\tau$. Towards addressing this problem, we
focus on prunings in which the evaluations are not chosen arbitrarily.

Given a finite run $\tau$ ending in state $\mathcal{I}$ of
$\Upsilon_{\mathcal{S}}$, an action $\alpha$, a parameter choice $\sigma$ and an
equality commitment $\mathcal{H}$ on
$\mathrm{CALLS}(\mathrm{DO}(\mathcal{I}, \alpha, \sigma))$, we say that evaluation
$\theta \in \mathrm{EVALS}_{\mathcal{H}}(\mathcal{I}, \alpha, \sigma)$ recycles
from $\tau$ if each value in the range of $\theta$ occurs in $\tau$. We say that
pruning $\Theta_{\mathcal{S}}$ is eventually recycling if every (finite or
infinite) path $\tau$ in $\Theta_{\mathcal{S}}$ contains only finitely many states
generated by non-recycling evaluations. Formally, if $\tau = s_0 s_1 \cdots$ and
the service call evaluation used in $s_i \Rightarrow_c s_{i+1}$ is denoted as
$\theta_i$, then there are only finitely many indexes $j$ such that $\theta_j$
does not recycle from $\tau[j]$.

Lemma C.3. Let $\Upsilon_{\mathcal{S}}$ be a concrete transition system.

(i) All eventually recycling prunings of $\Upsilon_{\mathcal{S}}$ are finite.

(ii) If $\Upsilon_{\mathcal{S}}$ is state-bounded, then it has at least one
eventually recycling pruning.

Proof of Lemma C.3. (i): All eventually recycling prunings
are finite.

Let $\Theta_{\mathcal{S}}$ be an eventually recycling pruning of concrete
transition system $\Upsilon_{\mathcal{S}}$. By virtue of being a pruning,
$\Theta_{\mathcal{S}}$ is finitely branching. We show next that every simple path
in $\Theta_{\mathcal{S}}$ has finite length, which together with finite branching
implies finiteness by König's Lemma.

Towards a contradiction, assume that there exists an infinite simple
run $\tau$ in $\Theta_{\mathcal{S}}$. Since $\Theta_{\mathcal{S}}$ is eventually
recycling, there is a finite prefix of $\tau$ such that all values occurring in
$\tau$ occur also in this prefix. Therefore, $\tau$ contains only finitely many
distinct values, and hence only finitely many distinct states (databases of
given schema over these values). If $\tau$ has infinite length, then a
pigeonhole argument contradicts the assumption that $\tau$ is simple.

(ii): If $\Upsilon_{\mathcal{S}}$ is state-bounded, then it has an eventually
recycling pruning.

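Let $\Theta_{\mathcal{S}}$ be a pruning obtained from $\Upsilon_{\mathcal{S}}$ by
picking the finite subset of evaluations
$SE \subseteq \mathrm{EVALS}_{\mathcal{H}}(s, \alpha, \sigma)$ as follows: if there
is a run $\tau$ in $\Theta_{\mathcal{S}}$ from $s_0$ to $s$ such that
$\mathrm{EVALS}_{\mathcal{H}}(s, \alpha, \sigma)$ includes at least one evaluation
that recycles from $\tau$, then $SE$ contains exclusively recycling evaluations
(i.e. for each evaluation $\theta \in SE$, there is a run $\tau$ from $s_0$ to $s$
in $\Theta_{\mathcal{S}}$ such that $\theta$ recycles from $\tau$).
Otherwise, $SE$ is an arbitrary finite subset of
$\mathrm{EVALS}_{\mathcal{H}}(s, \alpha, \sigma)$.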

We prove that pruning $\Theta_{\mathcal{S}}$ is eventually recycling. By
definition, if $\Upsilon_{\mathcal{S}}$ is state-bounded then
$|\mathrm{ADOM}(s)| < b$ for each state $s$, where $b$ is the size bound on the
state. Assume towards a contradiction that $\Theta_{\mathcal{S}}$ contains a run
$\tau = s_0 s_1 s_2 \cdots$ that includes infinitely many states generated by
evaluations that do not recycle from $\tau$. It follows that there must exist a
finite $k > 0$ such that $|\bigcup_{i=0}^{k} \mathrm{ADOM}(s_i)| > 3b$ and such
that $\mathrm{ADOM}(s_{k+1})$ contains at least one fresh value, i.e.
$\mathrm{ADOM}(s_{k+1}) - \bigcup_{i=0}^{k} \mathrm{ADOM}(s_i) \neq \emptyset$.
Let $\theta_{k+1} \in \mathrm{EVALS}_{\mathcal{H}}(s_k, \alpha, \sigma)$ be the
service call evaluation that generates $s_{k+1}$. Clearly $\theta_{k+1}$ does not
recycle from $\tau$, since it contains at least one fresh value in its range.
However observe that, since the $k$-length prefix of $\tau$ contains at least
$3b$ distinct values, this prefix contains at least $b$ values that are distinct
from the values in $\mathrm{ADOM}(\mathcal{I}_0) \cup \mathrm{ADOM}(s_k)$ (since
by state-boundedness,
$|\mathrm{ADOM}(\mathcal{I}_0) \cup \mathrm{ADOM}(s_k)| < 2b$). Call the set of
these values $V$.

Also by state-boundedness, $\theta_{k+1}$ introduces at most $b$ fresh values.
The values in $V$ can be used instead of the fresh values introduced by
$\theta_{k+1}$, to obtain another evaluation $\theta'_{k+1}$ that respects
$\mathcal{H}$. Hence $\theta'_{k+1}$ witnesses an evaluation in
$\mathrm{EVALS}_{\mathcal{H}}(s_k, \alpha, \sigma)$ that does recycle from $\tau$.
But this contradicts the definition of $\Theta_{\mathcal{S}}$, which mandates
that $\theta_{k+1}$ be dropped in favor of $\theta'_{k+1}$. □

This result implies that if $\Upsilon_{\mathcal{S}}$ is state-bounded, then there
exists a finite-state abstract transition system $\Theta_{\mathcal{S}}$ that is
persistence-preserving bisimilar to $\Upsilon_{\mathcal{S}}$. Indeed, any
eventually recycling pruning of $\Upsilon_{\mathcal{S}}$ can play the role of
$\Theta_{\mathcal{S}}$ (it is finite by Lemma C.3 (i), it is bisimilar to
$\Upsilon_{\mathcal{S}}$ by Lemma C.1, and one is guaranteed to exist by
Lemma C.3 (ii)).

Construction of Eventually Recycling Pruning. The existence
result in Lemma C.3 is non-constructive and therefore does not
yet yield decidability of verification even if the concrete transition
system $\Upsilon_{\mathcal{S}}$ is state-bounded. We next present Algorithm
RCYCL, which is guaranteed to construct an eventually recycling pruning
when its input DCDS is state-bounded, but which may diverge
otherwise.



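Algorithm RCYCL

Input: $\mathcal{S} = \langle \mathcal{D}, \mathcal{P} \rangle$, a DCDS with data
layer $\mathcal{D} = \langle \mathcal{C}, \mathcal{R}, \mathcal{E}, \mathcal{I}_0 \rangle$
and process layer $\mathcal{P} = \langle \mathcal{F}, \mathcal{A}, \varrho \rangle$.

    $\Sigma := \{\mathcal{I}_0\}$, ${\Rightarrow} := \emptyset$, UsedValues := $\mathrm{ADOM}(\mathcal{I}_0)$, Visited := $\emptyset$
    repeat
        pick state $\mathcal{I} \in \Sigma$, action $\alpha$ and legal parameters $\sigma$
            such that $(\mathcal{I}, \alpha, \sigma) \notin$ Visited
        RecyclableValues := UsedValues $- (\mathrm{ADOM}(\mathcal{I}_0) \cup \mathrm{ADOM}(\mathcal{I}))$
        pick set $V$ of $n$ service call results such that:
            $|V| = n = |\mathrm{CALLS}(\mathrm{DO}(\mathcal{I}, \alpha, \sigma))|$ and
            if $|$RecyclableValues$| \geq n$
            then $V \subseteq$ RecyclableValues          % recycled values
            else $V \subseteq \mathcal{C} -$ UsedValues   % fresh values
        $F := \mathrm{ADOM}(\mathcal{I}_0) \cup \mathrm{ADOM}(\mathcal{I}) \cup V$
        for each $\theta \in \mathrm{EVALS}_F(\mathcal{I}, \alpha, \sigma)$ such that $\mathcal{I}_{next} \models \mathcal{E}$,
                where $\mathcal{I}_{next} = \mathrm{DO}(\mathcal{I}, \alpha, \sigma)\theta$ do
            $\Sigma := \Sigma \cup \{\mathcal{I}_{next}\}$
            ${\Rightarrow} := {\Rightarrow} \cup \{(\mathcal{I}, \mathcal{I}_{next})\}$
            UsedValues := UsedValues $\cup\ \mathrm{ADOM}(\mathcal{I}_{next})$
            Visited := Visited $\cup\ \{(\mathcal{I}, \alpha, \sigma)\}$
        end
    until $\Sigma$ and $\Rightarrow$ no longer change.
    return $\langle \mathcal{C}, \mathcal{R}, \Sigma, \mathcal{I}_0, \Rightarrow \rangle$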



Observe that algorithm RCYCL performs several nondeterministic
choices in each iteration. The particular choices (and their order)
do not matter, by Theorem 5.4.
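
The value-picking step of RCYCL can be sketched in Python as follows. This is
only an illustration under stated assumptions: values are integers, the infinite
domain is modeled by an iterator `fresh_source`, the recycling test is read as
"at least n recyclable values", and the function name `pick_values` is
hypothetical. The nondeterministic pick is replaced here by a deterministic one
(the smallest candidates), which is one of the allowed choices.

```python
from itertools import count


def pick_values(used_values, adom_i0, adom_i, n, fresh_source):
    """Choose n candidate service-call results, recycling when possible.

    Returns (values, recycled) where `recycled` tells which branch was taken.
    """
    recyclable = used_values - (adom_i0 | adom_i)
    if len(recyclable) >= n:
        # Recycle: reuse values seen before but absent from I_0 and I.
        return set(sorted(recyclable)[:n]), True
    # Otherwise draw values that were never used so far.
    fresh = set()
    while len(fresh) < n:
        value = next(fresh_source)
        if value not in used_values:
            fresh.add(value)
    return fresh, False


used = {0, 1, 2, 3, 4}
recycled_pick = pick_values(used, {0}, {1}, 2, count(100))
fresh_pick = pick_values(used, {0}, {1}, 4, count(100))
```

With three recyclable values available, a request for two results is served by
recycling, while a request for four falls back to fresh values.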



Proof of Theorem 5.4 (Sketch).

First, we show that algorithm RCYCL builds a pruning. Items (i)
and (iii) in the definition of pruning are trivially satisfied in every
run of RCYCL. Item (ii) follows from the following claim:

Claim: for any choice of $V$ such that
$|V| \geq |\mathrm{CALLS}(\mathrm{DO}(\mathcal{I}, \alpha, \sigma))|$, the set of
equality commitments represented by the successors of $\mathcal{I}$ generated by
the evaluations in $\mathrm{EVALS}_F(\mathcal{I}, \alpha, \sigma)$ coincides with
the set of commitments represented by the successors of $\mathcal{I}$ in
$\Upsilon_{\mathcal{S}}$.

Next, we show that if $\mathcal{S}$ is state-bounded, every run of RCYCL
terminates. Indeed, state-boundedness guarantees that in each iteration,
at most $b$ service call values are needed, where $b$ is the state size bound.
But after running "sufficiently" long, RCYCL variable UsedValues accumulates at
least $3b$ distinct values. At each subsequent step of the algorithm, there will
therefore exist at least $b$ values distinct from the active domains of
$\mathcal{I}_0$ and $\mathcal{I}$, so the pick of $V$ will always recycle values
(observe that RCYCL only picks evaluations from the set
$\mathrm{EVALS}_F(\mathcal{I}, \alpha, \sigma)$). UsedValues will no longer change,
and therefore $\Sigma$ and $\Rightarrow$ must eventually saturate (a key reason
for this is the bookkeeping of variable Visited, which avoids repeating the
nondeterministic pick for any combination of state, action and parameter
instantiation $(\mathcal{I}, \alpha, \sigma)$).

Finally, since RCYCL terminates, it outputs a finite-state
pruning, which is trivially eventually recycling. □

Theorem 5.4 and Theorem 3.2 directly imply Theorem 5.3.

C.4 GR-Acyclic DCDSs 



Proof of Theorem 5.5. For the proof, we reduce from the
undecidable problem of checking if the run of a deterministic Turing
Machine is confined to a bounded-length segment of the tape
(we say that the TM is tape-bounded). This in turn is undecidable
by reduction from the halting problem: given a deterministic TM $T$,
build a TM $T'$ such that $T'$ is tape-bounded if and only if $T$ halts.
$T'$ simulates $T$ but also records on the tape the historical configurations
of $T$. At each step, $T'$ checks if the most recent configuration
of $T$ was seen in the history. If so, $T'$ stops simulating $T$ and enters
a loop in which it keeps extending the right end of its tape. It is
easy to see that $T'$ is tape-bounded if and only if $T$ halts.

We reuse without change the reduction exhibited in the proof of
Theorem 4.1. Recall that the reduction constructs for every Turing
Machine TM a DCDS with deterministic services $\mathcal{S}$ that simulates
the computation of TM. We recall from the proof that the
only service in the process layer, service newCell, is guaranteed
to be called only with distinct arguments across distinct transitions,
and so its behavior is unaffected by the choice of deterministic versus
nondeterministic semantics. We also note that the state of the
DCDS has size linear in the length of the tape segment visited by
TM, so tape-boundedness reduces to state-boundedness. □



Proof of Theorem 5.6 (Sketch). We prove the result by
counting the maximum number of different values in a state of the
transition system.

Since this task is undecidable (by Theorem 5.5), we necessarily
have to approximate this value. The approximation is performed by
analysing a different, much more abstract transition system we call the
dataflow transition system (to distinguish it from the abstract system
that is bisimilar to the concrete system).

The dataflow system is a DCDS obtained as follows from the
dataflow graph and $\mathcal{I}_0$: for each node of the dataflow graph, there
is a unary relation in the dataflow system, and for each normal (special)
edge in the dataflow graph, there is a normal (special) transition
in the dataflow system between the corresponding relations.
The schema of the dataflow system is a set of relation names with
arity one, in correspondence to the nodes of the dataflow graph. A
state of the dataflow system is an instantiation of its schema using
values from the domain $\mathcal{C}$.

For each term t appearing in a relation in the initial state of the 
concrete system, there is a term t in the corresponding relation of 



the initial state of the dataflow system. Being in one state of the 
dataflow system, the next state is constructed as follows: 

• for each normal transition from a relation A to a relation B, 
for each term t in the relation A of the current state, there is a 
term t in the relation B of the next state. 

• for each special edge from a relation A to a relation B, for 
each term t in the node A of the current state, there is a fresh 
term t' in node B of the next state. 
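
The two rules above can be sketched as a small Python simulation. The
representation is an assumption made for illustration only: a state is a dict
from relation names to sets of terms, edges are (source, target) pairs, and
fresh terms are tuples `("fresh", k)` drawn from a generator. The function
names `dataflow_step` and `run` are hypothetical.

```python
from itertools import count


def dataflow_step(state, normal_edges, special_edges, fresh_terms):
    """Compute the next dataflow state.

    `state` maps each relation name to the set of terms it holds.
    Normal edges copy terms; special edges create one fresh term per
    source term.
    """
    next_state = {relation: set() for relation in state}
    for source, target in normal_edges:
        next_state[target] |= state[source]
    for source, target in special_edges:
        for _ in state[source]:
            next_state[target].add(next(fresh_terms))
    return next_state


def run(state, normal_edges, special_edges, steps):
    """Iterate `dataflow_step` for the given number of steps."""
    fresh_terms = (("fresh", k) for k in count())
    for _ in range(steps):
        state = dataflow_step(state, normal_edges, special_edges, fresh_terms)
    return state


start = {"A": {"t"}, "B": set()}
# Special edge A -> B, but B does not copy itself: old terms are forgotten.
forgetful = run(start, [("A", "A")], [("A", "B")], steps=5)
# Adding a self-loop on B recalls every generated term: B keeps growing.
recalling = run(start, [("A", "A"), ("B", "B")], [("A", "B")], steps=5)
```

The second configuration, where a cycle feeds a special edge that leads into
another cycle, is the kind of generate-then-recall pattern whose unbounded
growth the acyclicity condition is meant to exclude.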

It is easy to see the following claim:

Claim 1. For any run $\tau$ of length $m > 0$ in the concrete
system, there is a run $\tau'$ of length $m$ in the dataflow
system, such that the size of the active domain of state
$\tau(i)$ is at most the size of the active domain of state
$\tau'(i)$.

As a result, any state bound for the dataflow system also bounds the
state of the concrete system. We compute such a bound next.

Consider the dataflow graph of A. GR-acyclicity forces cycles 
with special edges to not be connected to any other cycles in the 
dataflow graph. More specifically, each connected component of 
the dataflow graph must have one of the following types: 

A: A simple cycle $C$ (possibly with special edges), possibly connected
with several directed acyclic graphs (DAGs), such that
the component contains no additional cycle beyond $C$.

B: Several cycles $C_1, \ldots, C_m$ containing only normal edges,
each $C_i$ possibly connected to several DAGs, such that the
component contains no other cycle beyond the $C_i$'s, and there
is no path with special edges connecting two cycles $C_i, C_j$.

C: A DAG, possibly containing normal and special edges.

Denote with 

d: the longest path of the dataflow graph after deleting the cy- 
cles, 

b: the maximum number of special edges going out of a node of 
the dataflow graph plus one, and 

n: the number of nodes of the dataflow graph. 

It is easy to see that in each transition of the dataflow system, for
each term in the current state, there can be at most $n \cdot b$ distinct
terms in the next state.

First, consider the components of type A. Call the DAGs $D$ connected
to the unique cycle $C$ via edges from $D$ to $C$ input DAGs.
Call output DAGs the DAGs connected via edges from $C$ to $D$. It
is easy to see that after $d$ transition steps, in any run of the dataflow
system there is no term in any relation of an input DAG (all have
been forgotten), and at most

$$m := |\mathrm{ADOM}(\mathcal{I}_0)| + n \cdot b \cdot |\mathrm{ADOM}(\mathcal{I}_0)| + \cdots + n^{d} \cdot b^{d} \cdot |\mathrm{ADOM}(\mathcal{I}_0)|$$

distinct terms may co-exist within the relations of the cycle. Moreover,
after $d$ steps, the total number of distinct terms in the cycle
will no longer increase in any run suffix starting from step $d + 1$.
Consider now how the $m$ terms can be copied into the output DAGs.
It is again easy to see that there can be at most $n^{d} \cdot b^{d} \cdot m$
distinct terms in any relation of an output DAG. As a result, in any step
at most $n^{d+1} \cdot b^{d} \cdot m$ distinct terms may co-exist within a type A
component.

Second, consider the components of type B. A similar argument
yields at most $n^{d+1} \cdot b^{d} \cdot m$ different terms that may co-exist
within a type B component.

Third, it is easy to see that there can be at most
$n^{d} \cdot b^{d} \cdot |\mathrm{ADOM}(\mathcal{I}_0)|$ different terms within a
type C component.

All in all, at most $|\mathrm{ADOM}(\mathcal{I}_0)| \cdot n^{2d+2} \cdot b^{2d}$
distinct terms may co-exist in a state of the concrete transition system. □

D. DISCUSSION



Proof of Theorem 6.1 (Sketch). The technical problem
here is to force the results of nondeterministic service calls to conform
to historic evaluations.

Let $D$ be a deterministic DCDS. We rewrite $D$ to obtain a new
DCDS $N$ whose semantics under nondeterministic services coincides
with that of $D$ under deterministic services. For each term
$f(a_1, \ldots, a_n)$ appearing in some effect
$e := q^{+} \land Q^{-} \rightsquigarrow E$ of $D$,
we rewrite $D$ as follows. We extend the schema with a new
$(n+1)$-ary relation $R_f$. Intuitively, $R_f(a_1, \ldots, a_n, r)$ states that
the call $f(a_1, \ldots, a_n)$ evaluates to $r$. We extend the effect to
record this fact, replacing $e$ with
$e' := q^{+} \land Q^{-} \rightsquigarrow E \land R_f(a_1, \ldots, a_n, f(a_1, \ldots, a_n))$.
To ensure that $R_f$ records all past calls of $f$, we add to each action an
effect that simply copies $R_f$. We also add the functional dependency
$a_1, \ldots, a_n \rightarrow r$ on $R_f$.
Notice that any attempt to record a service call with a result distinct
from a past invocation violates the functional dependency, and
the transition does not occur. It is easy to see that, if we project the
states of $\Upsilon_N$ on the schema of $D$, we obtain $\Upsilon_D$. □
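
The effect of the relation $R_f$ and its functional dependency can be mimicked
in a few lines of Python. This is a toy illustration, not the construction
itself: the class `CallLog` and its method `record` are hypothetical names, rows
are tuples whose last component is the result, and a blocked transition is
modeled by returning `False`.

```python
class CallLog:
    """Toy stand-in for the relation R_f with the dependency args -> result."""

    def __init__(self):
        self.rows = set()  # each row is (arg_1, ..., arg_n, result)

    def record(self, args, result):
        """Try to add R_f(args, result); return False if the FD is violated."""
        args = tuple(args)
        for row in self.rows:
            if row[:-1] == args and row[-1] != result:
                return False  # conflicting past result: transition blocked
        self.rows.add(args + (result,))
        return True


log = CallLog()
first = log.record(("a",), 1)       # first call f(a) = 1 is recorded
conflict = log.record(("a",), 2)    # f(a) = 2 contradicts the log
repeat = log.record(("a",), 1)      # same result again is fine
other = log.record(("b",), 2)       # different argument, any result
```

Only the runs in which every repeated call agrees with the log survive, which
is how a nondeterministic service is forced to behave deterministically.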



Proof of Theorem 6.2 (Sketch). The challenge here lies
in forcing a deterministic service $f_d$ to return possibly distinct results
for same-argument calls of the nondeterministic service $f_n$ it
corresponds to.

The trick is to call $f_d$ with one additional argument, which plays
the role of a timestamp, where each state in the run has its own
unique timestamp. This way same-argument calls of $f_n$ at distinct
steps in the run correspond to distinct-argument calls of $f_d$, which
therefore simulates the desired nondeterministic behavior.

An additional trick is used to get the run to generate a sequence
of unique values to act as timestamps. We use a deterministic service
$new(x)$ to generate a new timestamp, we record the successor
relation over timestamps in binary relation $succ$, and the most recent
timestamp in unary relation $now$. We add to each action

• the effect

    $now(x) \rightsquigarrow now(new(x)) \land succ(x, new(x))$

which extends the successor relation by one timestamp and
sets the new timestamp as most recent; and

• the effect

    $succ(x, y) \rightsquigarrow succ(x, y)$

which accumulates the historical $succ$ entries.

We are not quite done, as we still need to ensure that $succ$ induces
a linear order on the collection of timestamps generated during
the run. For this purpose we employ the same trick as the proof
of Theorem 4.1. We describe it below for the sake of proof
self-containment.

Observe that, by definition of the effect extending $succ$, at each
step, the generated timestamp has at most one successor.

However, if the call $new(x)$ returns a previously seen timestamp,
then there can be some timestamp with several predecessors in $succ$.
We rule out this case by declaring the second component of $succ$
to be a key. It follows that $succ$ must be either (i) a linear path
over timestamps (possibly starting from a source node that has a
self-loop), or (ii) must contain a simple cycle involving more than
one timestamp. The simple cycle is created when $new$ returns the
minimal element of the $succ$ relation.

We wish to force case (i). To rule out case (ii), we proceed as follows:
we initialize $succ$ to contain a source node that can never
be a timestamp because it cannot be returned by $new$ without violating
the key constraint on $succ$. To this end, we initialize $succ$ in
$\mathcal{I}_0$ to $succ^{\mathcal{I}_0} = \{(0,0), (0,1)\}$ and
$now^{\mathcal{I}_0} = \{1\}$. Notice that $succ^{\mathcal{I}_0}$ has type (i). An
easy induction shows that every run prefix must also construct a $succ$
relation of type (i), since any attempt to extend $succ$ with an edge back to
one of its existing nodes violates the key constraint. It also follows easily
that during the run, $now$ takes values from the linear path starting at 1, and
never includes 0. □
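
The role of the key constraint can be checked on a small Python model of the
two effects. The sketch assumes integer timestamps, represents $succ$ as a set
of pairs, and models a forbidden transition by returning `None`; the function
name `extend` is hypothetical.

```python
def extend(succ, now, new_value):
    """Apply the two effects on succ and now for one step.

    Returns (succ', now') or None when the key constraint on the second
    component of succ is violated, i.e. the transition is not allowed.
    """
    candidate = set(succ) | {(now, new_value)}
    second_components = [y for (_, y) in candidate]
    if len(second_components) != len(set(second_components)):
        return None
    return candidate, new_value


succ0, now0 = {(0, 0), (0, 1)}, 1
succ1, now1 = extend(succ0, now0, 2)      # fresh timestamp 2
succ2, now2 = extend(succ1, now1, 3)      # fresh timestamp 3
back_to_source = extend(succ2, now2, 0)   # would add (3, 0)
back_to_old = extend(succ2, now2, 2)      # would add (3, 2)
self_loop = extend(succ2, now2, 3)        # would add (3, 3)
```

Starting from the initialization used in the proof, every attempt to return an
already present node (including the source 0) is rejected, so the relation can
only grow as a linear path.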

E. EXAMPLE DCDS: TRAVEL REIMBURSEMENT SYSTEM

We model the process of reimbursing travel expenses in a university,
and the corresponding audit system, as two different subsystems.
In particular, the first subsystem, called the request system,
manages the submission of reimbursement requests by an employee,
and the preliminary inspection and approval of the request by an
employee in the accounting department (we shall call her the monitor).
A log of accepted requests is submitted to the second subsystem,
the audit system, in which requests can be accumulated and
checked for accuracy by calling external web services
(for instance to obtain the exchange rate from foreign currency to
USD on a past date, or to check that the employee actually was on
the declared flight).

Request system. To keep the example simple we model a travel
reimbursement request as being associated with the name of the
requester, and information related to the corresponding flight and
hotel costs. After a request is submitted, a monitor checks the
request and decides to accept or reject it. If a request is
rejected, the employee needs to modify the information regarding
hotel and flight, while the employee name is not changed by the
update. After the update by the employee, the monitor again
checks the request, and the reject-check loop continues until the
monitor accepts the request. After a request is accepted, a log of
the request is sent to the audit system, and the request system
is ready to process the next travel request.

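We model the request system by a DCDS
$\mathcal{S}_r = \langle \mathcal{D}, \mathcal{P} \rangle$, in which
$\mathcal{D} = \langle \mathcal{C}, \mathcal{R}, \mathcal{E}, \mathcal{I}_0 \rangle$
such that $\mathcal{C}$ is a countably infinite set of constants,
$\mathcal{I}_0 = \{Status(\text{'readyForRequest'}), true\}$, and $\mathcal{R}$ is a
database schema as follows: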

• Status = (status), a unary relation that keeps the state of the
request subsystem, and can take three different values:
'readyForRequest', 'readyToVerify', and 'readyToUpdate',

• Travel = (eName), holding the name of the employee;

• Hotel = (hName, date, price, currency, priceInUSD), holding
the hotel cost information of the employee's travel, which
might have been paid in some other currency than USD,

• Flight = (date, fNum, price, currency, priceInUSD), holding
the flight cost information.

The process layer is defined as
$\mathcal{P} = \langle \mathcal{F}, \mathcal{A}, \varrho \rangle$ where
$\mathcal{F}$ is a set of the following nondeterministic service calls, each
modeling an input of an external value by the employee. Specifically,

• INENAME(): models the input of the employee name (filled
in by the employee),

• INHNAME(): hotel name,

• INHDATE(): arrival date,

• INHPRICE(): sum paid to the hotel (possibly in foreign currency),

• INHCURRENCY(): currency exchange rate at that date,

• INHPINUSD(): amount paid to the hotel in USD,

• INFDATE(): flight date,

• INFNUM(): flight number,

• INFPRICE(): ticket price, possibly in foreign currency,

• INFCURRENCY(): currency exchange rate at date of ticket
payment,

• INFPUSD(): ticket price in USD.

There is one additional service.

• MAKEDECISION(): a nondeterministic service modeling the
decision of the human monitor. It returns 'requestConfirmed'
if the request is accepted, and returns 'readyToUpdate' if the
request needs to be updated by the employee.

The set $\mathcal{A}$ of actions includes (among others):

InitiateRequest:
    true $\rightsquigarrow$ Status('readyToVerify')
    true $\rightsquigarrow$ Travel(INENAME())
    true $\rightsquigarrow$ Hotel(INHNAME(), INHDATE(), INHPRICE(), INHCURRENCY(), INHPINUSD())
    true $\rightsquigarrow$ Flight(INFDATE(), INFNUM(), INFPRICE(), INFCURRENCY(), INFPUSD())

VerifyRequest:
    true $\rightsquigarrow$ Status(MAKEDECISION())
    Travel($n$) $\rightsquigarrow$ Travel($n$)
    Hotel($x_1, \ldots, x_5$) $\rightsquigarrow$ Hotel($x_1, \ldots, x_5$)
    Flight($x_1, \ldots, x_5$) $\rightsquigarrow$ Flight($x_1, \ldots, x_5$)

UpdateRequest:
    true $\rightsquigarrow$ Status('readyToVerify')
    Travel($n$) $\rightsquigarrow$ Travel($n$)
    true $\rightsquigarrow$ Hotel(INHNAME(), INHDATE(), INHPRICE(), INHCURRENCY(), INHPINUSD())
    true $\rightsquigarrow$ Flight(INFDATE(), INFNUM(), INFPRICE(), INFCURRENCY(), INFPUSD())

AcceptRequest:
    Status('requestConfirmed') $\rightsquigarrow$ Status('readyForRequest')



When a request is initiated (modeled by the action
InitiateRequest), (i) the system changes state to "waiting for
verification", (ii) a travel event is generated and the employee fills
in his name, (iii) the employee fills in all hotel information, and
(iv) the employee fills in all flight information.

Action VerifyRequest models the preliminary check by the monitor.
Travel event, hotel and flight information are unchanged,
but the system status is set by the nondeterministic service call
MAKEDECISION(), which models the monitor's decision for the
currently active travel information.

Figure 9: Dataflow graph for request system

If the monitor rejects, then she sets the next state to
'readyToUpdate', which will trigger the action UpdateRequest,
which in turn collects once again the hotel and flight information
from the employee, and moves the status to 'readyToVerify'.

Finally, action AcceptRequest returns the system to the state
'readyForRequest', in which it is ready to accept a new request.

Notice the use of the always true predicate true, with the evident 
meaning. A convenient way to model its meaning in the DCDS 
framework is to think of it as a nullary relation, initialized to con- 
tain the empty tuple, which is copied in perpetuity by each action 
(true never changes its interpretation). We omit the corresponding 
copy effects, treating them as built-in. 

Notice how the condition-action rules in the set $\varrho$ below guard
the actions by the current state of the system:

    Status('readyForRequest')   $\mapsto$  InitiateRequest
    Status('readyToVerify')     $\mapsto$  VerifyRequest
    Status('readyToUpdate')     $\mapsto$  UpdateRequest
    Status('requestConfirmed')  $\mapsto$  AcceptRequest
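
The control flow induced by these rules can be traced with a short Python
sketch. It tracks only the Status relation and deliberately ignores the Travel,
Hotel and Flight data; the monitor's answers are supplied as a list rather than
by a service call. The names `RULES`, `fire` and `simulate` are hypothetical.

```python
RULES = {
    "readyForRequest": "InitiateRequest",
    "readyToVerify": "VerifyRequest",
    "readyToUpdate": "UpdateRequest",
    "requestConfirmed": "AcceptRequest",
}


def fire(status, decision=None):
    """Return (action, next_status) for the action enabled in `status`."""
    action = RULES[status]
    if action == "InitiateRequest":
        return action, "readyToVerify"
    if action == "VerifyRequest":
        # The new status is whatever MAKEDECISION() returns.
        if decision not in ("readyToUpdate", "requestConfirmed"):
            raise ValueError("unexpected decision: %r" % (decision,))
        return action, decision
    if action == "UpdateRequest":
        return action, "readyToVerify"
    return action, "readyForRequest"  # AcceptRequest


def simulate(decisions):
    """Run one request to completion, consuming the monitor's decisions."""
    decisions = iter(decisions)
    status, trace = "readyForRequest", []
    while True:
        decision = next(decisions) if status == "readyToVerify" else None
        action, status = fire(status, decision)
        trace.append(action)
        if action == "AcceptRequest":
            return trace, status


trace, final_status = simulate(["readyToUpdate", "requestConfirmed"])
```

With one rejection followed by an acceptance, the trace goes through the
reject-check loop once before AcceptRequest brings the system back to
'readyForRequest'.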



The dataflow graph corresponding to the request system is depicted
in Figure 9, where special edges are starred. Notice that
there can be multiple special edges between the same two nodes
(these are distinguished by unique edge ids, which we omit in the
figure to avoid clutter).

In particular, the group of special edges from the true node to the
Hotel node corresponds to the action of the employee filling in the hotel
information, modeled by calls to such services as INHNAME().
Similarly for the special edges from true to Flight.

The special edge between true and Travel is due to the employee
filling in his name into the created travel request. The special edge
from true to Status reflects the monitor's insertion of her decision
(see the call to MAKEDECISION() in the first effect of action
VerifyRequest in Example E), while the normal edge corresponds to a
change of the status without calling a service (this happens in other
actions). The self-loops on Flight, Hotel, and Travel are due to the
remaining (copy) effects of VerifyRequest. The self-loop on node
true is due to the modeling of this value by a singleton nullary relation
containing the empty tuple, which keeps being copied in each
action.

An inspection of this dataflow graph reveals that the request system
is not GR-acyclic, since it contains several instances of two
simple cycles connected by a path that includes a special edge. An
example is the path $\pi$ comprised of the self-loops around true and
Travel, and the special edge between them. However, the request
system is GR$^{+}$-acyclic. To illustrate this, notice that the path $\pi$ is
allowed by GR$^{+}$-acyclicity because the special edge leading into the
Travel loop is caused by action InitiateRequest, while all the subsequent
edges in $\pi$ are caused by other actions (in this case there
is only one subsequent edge in $\pi$, namely the self-loop on Travel,
caused by actions VerifyRequest and UpdateRequest).

We illustrate some $\mu\mathcal{L}_P$ properties pertaining to the proper
operation of the request system.

A property of interest is that once initiated, a request will eventually
be decided by the monitor, and the decision can only be
'readyToUpdate' or 'requestConfirmed' (a liveness property). We
show the property in the easier-to-read CTL syntactic sugar:

$$\mathsf{AG}\,\bigl(\forall n.\ Travel(n) \rightarrow \mathsf{A}\bigl(Travel(n)\ \mathsf{U}\ (Status(\text{'readyToUpdate'}) \lor Status(\text{'requestConfirmed'}))\bigr)\bigr)$$

Here $\mathsf{U}$ is the until operator (for this example, it is the strong
flavor, in which $\psi\,\mathsf{U}\,\phi$ means that $\phi$ is guaranteed to
eventually hold, and until it does $\psi$ must hold in every step). We note that
for a property to belong to $\mu\mathcal{L}_P$, it must require the bindings of
quantified variables to be continuously live between the step when the
quantification was evaluated and the step when the variable is used. This can be
done by using LIVE or by using any relation, in our example Travel.
The $\mu\mathcal{L}_P$ version of the property is given below:

$$\nu X.\bigl(\forall n.\ Travel(n) \rightarrow \mu Y.\bigl(Status(\text{'readyToUpdate'}) \lor Status(\text{'requestConfirmed'}) \lor [-](Travel(n) \land Y)\bigr)\bigr) \land [-]X$$

Another property of interest is that if the flight cost is not specified,
then the request is not accepted (a safety property). We use the
special constant $\bot$ to denote a null value (this need not be treated
specially in the semantics, any value of the domain can be reserved
for this purpose):

$$\mathsf{G}\,\neg\bigl(Status(\text{'requestConfirmed'}) \land \exists x_1, \ldots, x_4.\ Flight(x_1, x_2, \bot, x_3, x_4)\bigr).$$

The $\mu\mathcal{L}_P$ version is given below:

$$\nu X.\bigl(\neg\bigl(Status(\text{'requestConfirmed'}) \land \exists x_1, \ldots, x_4.\ Flight(x_1, x_2, \bot, x_3, x_4)\bigr)\bigr) \land [-]X$$

Audit system. After a request is verified by the monitor in the request
system, it is migrated to the audit system. The migration
is performed by a logging subsystem, which might perform operations
such as extending each travel event with a freshly generated
travel id that guarantees uniqueness across the entire history of
requests. The resulting tuples are stored in a database. We can model this
migration using the DCDS formalism, but we omit the specification
and focus directly on the audit system.

More specifically, we model the audit system by a DCDS
$\mathcal{S}_a = \langle \mathcal{D}_a, \mathcal{P}_a \rangle$, in which
$\mathcal{D}_a = \langle \mathcal{C}, \mathcal{R}, \mathcal{E}, \mathcal{I}_0 \rangle$.
$\mathcal{C}$ is a countably infinite set of constants. $\mathcal{R}$ is a
database schema as follows:

• Status = (status) is a unary relation keeping the state of the
audit subsystem, which can take two different values:
'checkPrice' and 'checkTravel', whose role is to sequence the
actions of the audit system appropriately.

• Travel = (id, eName, passed) extends the homonymous relation
of the request system with two fields: id (the travel
identifier), and passed, which will be set by the audit system
to reflect whether both the hotel and the flight price checks
succeed.

• Hotel = (trId, hName, date, price, currency, priceInUSD,
passed), where trId is a foreign key to the travel id and passed
is set by the audit system to reflect whether the claimed price
and the calculated price match.

• Flight = (trId, fNum, date, price, currency, priceInUSD,
passed), where trId and passed are analogous to the ones in
the Hotel relation.



Finally, $I_0$ is the output of the logging subsystem, to which we
add the fact Status('checkPrice') to initialize the audit system status.

The process layer is defined as $\mathcal{P} = \langle \mathcal{F}, \mathcal{A}, \varrho \rangle$, in
which $\mathcal{F}$ contains a deterministic service, where the call
CONVERTANDCHECK(price, currency, date, priceInUSD) performs
the official exchange rate acquisition and computation described
above, returning true if and only if the claimed price and
the computed one match.

A = {CheckPrice, CheckTravel} includes the following actions. 

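CheckPrice:

$$\begin{aligned}
\mathit{true} &\rightsquigarrow \mathit{Status}(\text{'checkTravel'})\\
\mathit{Travel}(i,n,v) &\rightsquigarrow \mathit{Travel}(i,n,v)\\
\mathit{Hotel}(x_1,x_2,\mathit{date},\mathit{price},\mathit{currency},\mathit{priceInUSD},x_7) &\rightsquigarrow \mathit{Hotel}(x_1,x_2,\mathit{date},\mathit{price},\mathit{currency},\mathit{priceInUSD},\\
&\qquad \text{CONVERTANDCHECK}(\mathit{date},\mathit{price},\mathit{currency},\mathit{priceInUSD}))\\
\mathit{Flight}(x_1,x_2,\mathit{date},\mathit{price},\mathit{currency},\mathit{priceInUSD},x_7) &\rightsquigarrow \mathit{Flight}(x_1,x_2,\mathit{date},\mathit{price},\mathit{currency},\mathit{priceInUSD},\\
&\qquad \text{CONVERTANDCHECK}(\mathit{date},\mathit{price},\mathit{currency},\mathit{priceInUSD}))
\end{aligned}$$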

Notice that the first effect changes the audit system's state to enter
the stage in which the two checks (for hotel and flight) are
combined. The second effect simply copies the request information.
The third and fourth effects each check the claimed price (for hotel
and flight, respectively), performing the conversion described above.
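
The four effects of CheckPrice can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary-of-sets encoding of a database instance, the RATES table, and the function names are hypothetical and not part of the formalism; None stands in for a not-yet-set passed field. The sketch also assumes that the successor instance contains exactly the facts produced by the effects, which is why Travel has to be copied explicitly.

```python
from fractions import Fraction

# Hypothetical exchange rates to USD, keyed by (currency, date).
RATES = {
    ("EUR", "2012-03-01"): Fraction(13, 10),
    ("USD", "2012-03-01"): Fraction(1),
}

def convert_and_check(date, price, currency, price_in_usd):
    """Deterministic service: the result depends only on the arguments."""
    return price * RATES[(currency, date)] == price_in_usd

def check_price(db):
    """Build the successor instance from the four effects of CheckPrice."""
    new = {
        "Status": {("checkTravel",)},   # first effect
        "Travel": set(db["Travel"]),    # second effect: plain copy
    }
    # Third and fourth effects: recompute the last (passed) position.
    for rel in ("Hotel", "Flight"):
        new[rel] = {
            (x1, x2, date, price, cur, usd,
             convert_and_check(date, price, cur, usd))
            for (x1, x2, date, price, cur, usd, _x7) in db[rel]
        }
    return new

db0 = {
    "Status": {("checkPrice",)},
    "Travel": {(1, "ICDT", None)},
    "Hotel": {(1, "Grand", "2012-03-01", Fraction(100), "EUR", Fraction(130), None)},
    "Flight": {(1, "AZ123", "2012-03-01", Fraction(200), "USD", Fraction(250), None)},
}
db1 = check_price(db0)
```

In this toy instance the hotel claim matches the converted price while the flight claim does not, so the two passed positions differ after the action.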

The second action works on the result of the first (this is ensured 
by the appropriate status changes). 



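CheckTravel:

$$\begin{aligned}
\mathit{true} &\rightsquigarrow \mathit{Status}(\text{'checkPrice'})\\
\mathit{Travel}(x_1,x_2,x_3) \land \mathit{Hotel}(x_1,y_1,\ldots,y_5,p_h) \land \mathit{Flight}(x_1,z_1,\ldots,z_5,p_f) \land \neg(p_h \land p_f) &\rightsquigarrow \mathit{Travel}(x_1,x_2,\mathit{false})\\
\mathit{Travel}(x_1,x_2,x_3) \land \mathit{Hotel}(x_1,y_1,\ldots,y_5,\mathit{true}) \land \mathit{Flight}(x_1,z_1,\ldots,z_5,\mathit{true}) &\rightsquigarrow \mathit{Travel}(x_1,x_2,\mathit{true})\\
\mathit{Hotel}(x_1,\ldots,x_7) &\rightsquigarrow \mathit{Hotel}(x_1,\ldots,x_7)\\
\mathit{Flight}(x_1,\ldots,x_7) &\rightsquigarrow \mathit{Flight}(x_1,\ldots,x_7)
\end{aligned}$$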



Notice that the second and third effects set the passed field 
for the request, computed as the conjunction of the corresponding 
fields set by the price check on flight and hotel. 
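
A minimal Python sketch of CheckTravel, under the same kind of assumptions (a hypothetical dictionary-of-sets encoding of the instance, with Python booleans for the passed positions), shows how the verdict is obtained by joining the three relations on the travel id:

```python
def check_travel(db):
    """Build the successor instance from the five effects of CheckTravel."""
    new = {
        "Status": {("checkPrice",)},   # first effect
        "Travel": set(),
        "Hotel": set(db["Hotel"]),     # fourth effect: plain copy
        "Flight": set(db["Flight"]),   # fifth effect: plain copy
    }
    for (x1, x2, _x3) in db["Travel"]:
        for hotel in db["Hotel"]:
            for flight in db["Flight"]:
                if hotel[0] != x1 or flight[0] != x1:
                    continue  # join on the travel id
                ph, pf = hotel[6], flight[6]
                if not (ph and pf):
                    new["Travel"].add((x1, x2, False))  # second effect
                if ph is True and pf is True:
                    new["Travel"].add((x1, x2, True))   # third effect
    return new

db = {
    "Status": {("checkTravel",)},
    "Travel": {(1, "ICDT", None), (2, "PODS", None)},
    "Hotel": {(1, "Grand", "d", 100, "EUR", 130, True),
              (2, "Plaza", "d", 80, "USD", 80, True)},
    "Flight": {(1, "AZ123", "d", 200, "USD", 250, False),
               (2, "LH456", "d", 300, "USD", 300, True)},
}
audited = check_travel(db)
```

Here request 1 fails the audit because its flight check failed, while request 2 passes. A Travel tuple without matching Hotel and Flight tuples is not reproduced by any effect in this sketch.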

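The process $\varrho$ is defined as follows:

$$\begin{aligned}
\mathit{Status}(\text{'checkPrice'}) &\mapsto \mathit{CheckPrice}\\
\mathit{Status}(\text{'checkTravel'}) &\mapsto \mathit{CheckTravel}
\end{aligned}$$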
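
The way these two condition-action rules sequence the audit can be illustrated with a small Python sketch. It abstracts each action to its effect on Status only; the names RULES, STATUS_AFTER, executable and run are hypothetical and chosen for this illustration.

```python
# Each rule pairs a guard on the current instance with an action name.
RULES = [
    (lambda db: ("checkPrice",) in db["Status"], "CheckPrice"),
    (lambda db: ("checkTravel",) in db["Status"], "CheckTravel"),
]

# Status value written by the first effect of each action.
STATUS_AFTER = {"CheckPrice": "checkTravel", "CheckTravel": "checkPrice"}

def executable(db):
    """Actions whose guard holds in the current instance."""
    return [action for guard, action in RULES if guard(db)]

def run(db, steps):
    """Fire the enabled action repeatedly, tracking only Status."""
    trace = []
    for _ in range(steps):
        (action,) = executable(db)  # exactly one rule is enabled per state
        trace.append(action)
        db = {**db, "Status": {(STATUS_AFTER[action],)}}
    return trace

trace = run({"Status": {("checkPrice",)}}, 4)
```

Starting from Status('checkPrice'), the two actions strictly alternate, which is what ensures that CheckTravel always works on the result of CheckPrice.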



The corresponding dependency graph is shown in Figure 10.
In this picture, nodes correspond to the positions of the schema. To
avoid clutter, we represent each relation by its first letter, and denote
the position number with a subscript. For instance, $T_1$ stands for
the first (id) position of the relation Travel, and $S$ stands for the
only position of the relation Status. Moreover, the edges without
label represent regular edges in the dependency graph, while the
starred edges depict special edges. For instance, the edge $F_5 \xrightarrow{*} F_7$
is introduced due to the fourth effect of action CheckPrice. It is
starred because it reflects the service call of CONVERTANDCHECK,
taking as argument the currency attribute of Flight (at position 5),
and storing its result in a Flight tuple at position 7 (the passed
attribute).

An inspection of the dependency graph reveals that the audit sys- 
tem is weakly acyclic, since there is no cycle including a special 
edge. 
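
This inspection can be mechanized. The Python sketch below assumes the usual construction of the dependency graph: a regular edge goes from each body position of a variable to each head position where that variable is copied, and a special edge goes from each body position of a variable to the head position holding a service call that takes it as argument. The encoding of effects and all function names are hypothetical; relations are abbreviated by their first letter as in the figure.

```python
from collections import defaultdict

def const(value):
    return ("const", value)

def call(*args):
    return ("call", args)

def copy(rel, arity):
    xs = [f"x{k}" for k in range(1, arity + 1)]
    return [(rel, xs)], [(rel, xs)]

def price_check(rel):
    common = ["x1", "x2", "date", "price", "currency", "priceInUSD"]
    check = call("date", "price", "currency", "priceInUSD")
    return [(rel, common + ["x7"])], [(rel, common + [check])]

def verdict(ph, pf, result):
    body = [("T", ["x1", "x2", "x3"]),
            ("H", ["x1", "y1", "y2", "y3", "y4", "y5", ph]),
            ("F", ["x1", "z1", "z2", "z3", "z4", "z5", pf])]
    return body, [("T", ["x1", "x2", const(result)])]

# Effects of CheckPrice followed by those of CheckTravel, as (body, head).
EFFECTS = [
    ([], [("S", [const("checkTravel")])]),
    copy("T", 3), price_check("H"), price_check("F"),
    ([], [("S", [const("checkPrice")])]),
    verdict("ph", "pf", False), verdict(const(True), const(True), True),
    copy("H", 7), copy("F", 7),
]

def dependency_edges(effects):
    regular, special = set(), set()
    for body, head in effects:
        occurs = defaultdict(set)  # variable -> body positions
        for rel, terms in body:
            for k, term in enumerate(terms, start=1):
                if isinstance(term, str):
                    occurs[term].add((rel, k))
        for rel, terms in head:
            for k, term in enumerate(terms, start=1):
                if isinstance(term, str):       # copied variable
                    regular |= {(p, (rel, k)) for p in occurs[term]}
                elif term[0] == "call":         # service call result
                    for arg in term[1]:
                        special |= {(p, (rel, k)) for p in occurs[arg]}
    return regular, special

def is_weakly_acyclic(regular, special):
    succ = defaultdict(set)
    for u, v in regular | special:
        succ[u].add(v)

    def reachable(src, dst):
        seen, stack = set(), [src]
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(succ[node])
        return False

    # A special edge u -> v lies on a cycle iff u is reachable from v.
    return not any(reachable(v, u) for u, v in special)

regular, special = dependency_edges(EFFECTS)
print(is_weakly_acyclic(regular, special))  # True
```

The only special edges produced lead into positions 7 of Hotel and Flight, and from those positions only the copy self-loops depart, so no cycle passes through a special edge.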




Figure 10: Weakly acyclic dependency graph of the audit system

We illustrate a desirable property of the audit system: it guarantees
that a request cannot pass the audit if either the flight check or the
hotel check fails:

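$$\mathsf{A\,G}\bigl(\exists i,n,v,x_2,\ldots,x_6.\,\mathit{Travel}(i,n,v) \land (\mathit{Hotel}(i,x_2,\ldots,x_6,\mathit{false}) \lor \mathit{Flight}(i,x_2,\ldots,x_6,\mathit{false})) \rightarrow \mathsf{F}\,\mathit{Travel}(i,n,\mathit{false})\bigr)$$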

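The $\mu\mathcal{L}_A$ version of the property is given below:

$$\nu X.\bigl(\exists i,n,v,x_2,\ldots,x_6.\,\mathit{Travel}(i,n,v) \land (\mathit{Hotel}(i,x_2,\ldots,x_6,\mathit{false}) \lor \mathit{Flight}(i,x_2,\ldots,x_6,\mathit{false})) \rightarrow \mu Y.(\mathit{Travel}(i,n,\mathit{false}) \lor \langle - \rangle Y)\bigr) \land [-]X$$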

Notice that, since the audit system uses deterministic services,
if we wish to verify it in isolation from the other subsystems, we
can verify a $\mu\mathcal{L}_A$ property, which is what the above is (we are not
enforcing the liveness of the variables $i, v, n$ between the step at
which the quantification was evaluated and the eventual step when
the passed attribute of Travel is set to false).

Recall, however, from Section 6 that we can verify mixed-semantics
DCDSs by reduction to nondeterministic services. If we wished
to verify the above property over the collection of subsystems, we
would have to express it as a $\mu\mathcal{L}_P$ property. This is easily done
using an until operator $\mathsf{U}$ (as illustrated above for the request system).
Moreover, it is actually compatible with our expectation about the
system's operation: while a request is being audited, we expect it
to persist in the system.



